DOCUMENT RESUME

ED 248 827                                        HE 017 889

AUTHOR        Adelman, Clifford
TITLE         The Standardized Test Scores of College Graduates,
              1964-1982.
INSTITUTION   National Inst. of Education (ED), Washington, DC.
PUB DATE      Dec 84
NOTE          101p.; Prepared for the Study Group on the Conditions
              of Excellence in American Higher Education. For
              related document, see ED 246 833.
PUB TYPE      Statistical Data (110) -- Reports -
              Evaluative/Feasibility (142)
EDRS PRICE    MF01/PC05 Plus Postage.
DESCRIPTORS   Academic Achievement; *College Entrance Examinations;
              *College Graduates; Graduate Study; Higher Education;
              Majors (Students); Professional Education; *Scores;
              *Standardized Tests; Student Characteristics; *Test
              Results; Trend Analysis
IDENTIFIERS   *Excellence in Education; Graduate Management
              Admission Test; Graduate Record Examinations; Law
              School Admission Test; Medical College Admission
              Test

ABSTRACT

Scores from 23 standardized tests that are used in
application to graduate and professional schools are analyzed,
primarily from the 1964-1982 period. The 23 examinations include
tests of advanced achievement in 15 subject areas, along with tests
of general learned abilities (the Graduate Record Examination/Verbal
and Quantitative, the Law School Admissions Test, the Graduate
Management Admissions Test, and Medical College Admission Test
Reading and Quantitative Analysis subtests). Major conclusions
include: (1) the quality of available data on test scores and on the
background characteristics of test-takers is highly variable; (2)
changes in test scores over a period should be measured in terms of
standard deviation units, and not in points or percentages; (3) of 23
examinations, performance declined on 15, remained stable on 4, and
advanced on 4 -- the greatest declines occurred in subjects requiring
high verbal skills; (4) none of the basic demographic characteristics
of the test-takers (age, race, gender, citizenship, or native
language), in themselves, explain the observed changes in performance
over the period; and (5) different undergraduate majors provide
convincing explanations of observed changes in performance. Issues
concerning the measurement of scaled test scores and the magnitude of
change are addressed. Data on test performance are appended.
(Author/SW)

***********************************************************************
*   Reproductions supplied by EDRS are the best that can be made      *
*                    from the original document.                      *
***********************************************************************



The Standardized Test Scores of College Graduates

1964-1982



Clifford Adelman
Senior Associate
National Institute of Education



Prepared for

The Study Group on the Conditions of Excellence in
American Higher Education



December, 1984



U.S. DEPARTMENT OF EDUCATION
NATIONAL INSTITUTE OF EDUCATION
EDUCATIONAL RESOURCES INFORMATION
CENTER (ERIC)

This document has been reproduced as
received from the person or organization
originating it.

Minor changes have been made to improve
reproduction quality.

Points of view or opinions stated in this
document do not necessarily represent official NIE
position or policy.



"PERMISSION TO REPRODUCE THIS
MATERIAL HAS BEEN GRANTED BY




TO THE EDUCATIONAL RESOURCES
INFORMATION CENTER (ERIC)."



The opinions, observations and conclusions offered in this paper are
those of the author as an individual. They should not be taken as
representing the opinions or positions of the National Institute of
Education, the U.S. Department of Education, the National Commission on
Excellence in Education or any of the Commissioners, or the Study Group
on the Conditions of Excellence in American Higher Education or any of
its members.






TABLE OF CONTENTS



Acknowledgements

Abstract

Text

1. Background and Purposes of this Paper

2. General Approach

3. Limitations for Policy Makers and Trend Watchers

4. Sources of Data

5. Inconsistencies in Reporting Data

6. Size of the Test-Taking Population

7. Are Test Scores Reflective or Predictive?

8. How Should We Measure Change in Scaled Test Scores?

9. How Should We Judge the Magnitude of Change?

10. Changes in Performance, 1964-1982

11. Three Periods of Change, 1964-1982

12. Other Internal and Administrative Influences

13. Explaining the Changes, 1: Numbers of Test-Takers

14. Explaining the Changes, 2: Age, Race, Gender

15. Explaining the Changes, 3: Citizenship, Native Language

16. Explaining the Changes, 4: Undergraduate Major

17. General Conclusions

18. A Message to the Testing Services: Gathering
    Consistent Data

19. A Message to the Commentators: a Plea Against Excuses

Notes

Bibliography

Tables and Graphs

Appendices



ACKNOWLEDGEMENTS 



The author is deeply grateful for the extended commentaries on earlier
drafts of this paper by William Turnbull (former President and
Distinguished Scholar in Residence at the Educational Testing Service)
and Jonathan Warren (former Senior Research Scientist at ETS/Berkeley),
and for assistance in data collection and explanation of test
development, reporting and allied procedures to Leonard Ramist, Eldon
Park, Cheryl Wild, Nancy Burton, Jerilee Grandy, Linda Cole, and
Lawrence Hecht of ETS and Rennee Kennish and Janice Wilson Plants of the
Law School Admissions Service. For comments and suggestions on the
final drafts I am grateful to Alice Irby, William Turnbull, Nancy Burton
and Eldon Park of ETS, Zelda Gamson and Alexander Astin of the Study
Group on the Conditions of Excellence in American Higher Education, and
my NIE colleagues John Wirt and Charles Stalford. I also wish to thank
Dr. Manuel J. Justiz, Director of the National Institute of Education,
for providing agency funds to analyze the LSAT data and for his interest
and support for this and other NIE initiatives in the field of higher
education.



ABSTRACT 



This is a secondary analysis of available and/or published data on the performance of
college graduates (and soon-to-be graduates) on 23 standardized tests used in the
process of admission to graduate and professional schools. For most of these tests,
the analysis covers the years 1964-1982. The 23 examinations include tests of general
learned abilities (the Graduate Record Examination/Verbal and Quantitative, the Law
School Admissions Test, the Graduate Management Admissions Test, and sub-tests in
Reading and Quantitative Analysis of the Medical College Admissions Test) and tests of
advanced achievement in 15 specific subject areas.

The approach taken is purposefully heuristic -- an attempt to illustrate both the virtues
and limitations of common-sense empiricism. Its major conclusions are:

1) The quality of available data on test scores and the background characteristics of
test-takers on these examinations is highly variable. The data have been
inconsistently gathered and reported over the years.

2) Some 550,000 U.S. citizens currently take these examinations every year. While the
group is self-selected, it represents a significant sample of the potential pool; and
we are justified in looking at its performance as reflective of the changing quality of
student learning in U.S. colleges and universities. At the same time, we would not be
justified in using these test scores as the primary indicator of the quality of
American higher education.

3) Changes in test scores over a period such as this should be measured in terms of
Standard Deviation Units, not points or percentages.

4) Of 23 examinations, performance declined on 15 (principally GRE Subject Area tests),
remained stable on 4 and advanced on 4. The greatest declines occurred in subjects
requiring high verbal skills.

5) There were three distinct historical periods of change: one of sharp decline
(1964-1970), one of easing or reversal of trend (1970-1976) and one of more modest
decline (1976-1982). These periods can be explained, in part, by internal content
adjustments of the tests and by changes in methods of test administration and scoring.

6) Most of the relationships between numbers of test-takers and trends in scores on
individual examinations over the period in question are counter-intuitive. We cannot
explain the observed changes in performance with reference to gross numbers.

7) None of the basic demographic variables -- age, race, or gender of the test-takers
-- in and of itself, can explain the observed changes in performance over the period.

8) Neither citizenship nor fluency in English, in themselves, can explain the observed
changes in performance. Only in combination with undergraduate major do these
variables begin to offer plausible hypotheses of influence on test score trends.

9) The performance and participation of U.S. students from different undergraduate
majors appear to offer the most convincing explanation of observed changes. Students
with undergraduate majors in professional and occupational fields (the most rapidly
growing group among both degree grantees and test-takers) underperform all others.

The paper concludes with a plea to the testing services to gather and report consistent
data to help educators monitor the quality of undergraduate education, and provides
some suggestions toward that end. The paper also urges that no one be excessively
defensive about test results. The more excuses we make, the less likely our colleges
and universities will focus their attention on the bottom line of undergraduate
education -- what students learn.




Standardized Test Scores of College Graduates

1964-1982

Some Notes and Interpretations

Clifford Adelman, Senior Associate, Program on Educational Policy
and Organization, the National Institute of Education



1. Background and Purposes of this Paper

A number of the recent national reports on the status of American
education have commented on the widely-watched indices of declining test
scores of secondary school graduates. But only the National Commission
on Excellence in Education (A Nation at Risk) and the Study Group on the
Conditions of Excellence in American Higher Education (Involvement in
Learning) mentioned the declining trends in the scores of college
graduates on standardized tests. As a staff member to both groups, the
author was responsible for gathering the background data on test scores,
and has commented elsewhere (1) on the implications of the trends in
scores for postsecondary standards and institutional assessment policy.
But in the course of writing those commentaries, I developed a sense of
little ease: it was obvious that the publicly accessible data from which I
drew my conclusions were not very good, and the conclusions themselves
were thus more polemical than they should have been. I resolved my
frustrations with the data by saying, in effect, "Well, if that's the
way they report it, then this is the only conclusion one can reach."

This paper is intended, in part, both to revise and expand upon those
earlier analyses. It is based on a great deal more data on both test
scores and the background characteristics of test-takers, as well as
discussions of potential interpretations of this data with experts in
the testing community. The ultimate intention of this paper is to plead
for better data on both reported standardized test scores and the
characteristics of test-takers, and for data that can be understood by a
broad audience, as well as by small groups of technicians and clients
with narrow interests. More importantly, it is to urge the higher
education community to use assessment data on college graduates to
improve our understanding of the influence of college curricula on
student learning.

Given that residue of polemical purpose, it is also important, at the
outset, to state what this paper is not:

o It is not a commentary on the generic uses of tests and test
scores or particular uses of the tests under discussion;

o It is not an analysis of the quality, reliability, validity or any
other technical property of the tests under discussion;

o It is not an inquiry into the question of whether these particular
tests and the methods of assessment they assume are the
best means of measuring college student learning.

The reader may find that some of these issues are implicit in the
discussion; but if this initial warning sign is flashed, I am confident
that this paper will not be misused.






Even in the best of times, college educators do not like to employ
scores on standardized examinations to reflect the quality of higher
education. Since the objectives of baccalaureate education include far
more than learning whatever it is that is measured by multiple-choice
examinations, and since life itself places a higher premium on careful
insight than on the speed of response that governs American testing,
their objection is understandable. The objection is understandable,
too, in light of the fact that those who take examinations such as the
Graduate Record Examinations (General and Subject Area), the Law School
Admissions Test, the Graduate Management Admissions Test, and the
Medical College Admissions Test tend to have higher educational
aspirations than their peers and are probably the more able students.
The test results, some might say, do not reflect the achievement of the
average or below average college student, hence should not be used to
judge the quality of American higher education.

While granting the general validity of these objections, I would point
out, first, that there is a distinction between measuring the quality of
student learning and assessing the quality of higher education -- the
former being but one of a number of tasks that one would employ in the
latter. One should note, too, that the graduate and professional
programs of our universities use these tests in much the same way as do
the undergraduate admissions offices -- sometimes with an even more
actuarial bias.* As Lauren and Daniel Resnick have pointed out,
"graduate schools, like employers, do not treat college diplomas as
equivalent, although it is still considered somewhat impolite to talk
very openly about the differences in standards among colleges" (2) -- for
which reason, in part, the tests are used in the graduate school
admissions process. At the least, they are a recognized common
currency, something that we cannot say about either credits or grades.
Somehow, though, we allow colleges and universities to criticize
secondary education on the basis of admissions test score trends, and
decline to comment on trends in graduate school admissions test scores.

This paper attempts to organize and provide interpretive frameworks for
publicly accessible measures of college student achievement, namely 18
years (1964-1982) of scores on such examinations as the Graduate
Records, the Graduate Management Admissions Test, the Law School
Admissions Test, and the Medical College Admissions Test (though I place
less emphasis on the MCATs because the entire test battery was changed
in 1977). All of these examinations are taken predominantly by college
graduates and soon-to-be graduates.

The information that forms the core of the analysis is presented in six
tables appended to this text:



* For a comprehensive review and discussion of the uses of all these
examinations in the graduate and professional school admissions process,
see Rodney Skager, "On the Use and Importance of Tests of Ability in
Admission to Postsecondary Education," in Alexandra Wigdor and Wendell
R. Garner (eds.), Ability Testing: Uses, Consequences and Controversies.
Washington, D.C.: National Academy Press, 1982. Part II, pp. 286-314.






Table A presents the basic data on mean scores, Standard Deviations,
and number of test-takers for the Verbal and Quantitative sections of
the Graduate Record exams, for 15 Graduate Record Subject Area Tests,
for the Law School Admissions Test, for the Graduate Management
Admissions Test, and for four of the sub-tests of the Medical
College Admissions Test.

Table B presents selected available data on the background
characteristics of the GRE/General, LSAT, MCAT and GMAT test-takers
in recent years, and focuses on key variables such as age, citizenship,
native language, sex, race, and post-college experience.

Table C indicates the preferred method of determining trends in
scaled scores on all the tests at issue, and provides information
concerning the scales themselves, and other data that bear upon our
interpretation.

Table D follows up on Table C by homing in on the changes in test
scores as determined by the preferred method.

Table E identifies turning points in the trends of test scores on
the Graduate Record examinations (Verbal, Quantitative, and 15
Subject Area Tests) during the 1970s.

Table F illustrates the comparative performance of students, by
college major, on the Verbal and Quantitative Sections of the
Graduate Record Examination, the Law School Admissions Test and
the Graduate Management Admissions Test, during the period 1977-1982.

Other tables illustrate very specific issues raised in this paper, for
example, distinctions between the performance of U.S. citizens and
foreign students on the GRE/General and GMATs, or changes in numbers of
test-takers on the GRE Subject Area examinations following changes in
the method of test administration.

2. General Approach 

My approach to analyzing this data is partially naive. That is, I do
not want to take the role of the psychometrician (test expert) -- which I
am not -- but rather that of the informed citizen, the individual who tries
to bring a common sense empiricism to bear on what otherwise appears as
a chaotic collection of data. I want to look at some numbers and try to
reason through them in a way that would reflect knowledge of at least a
modicum of the basic literature on testing. In following this heuristic
journey through reported data, the reader should remember that the
testing services have five constituencies:

1) College admissions officers and graduate school admissions
committees (the principal constituency which, as noted below,
determines the ways in which scores are reported);

2) Students who take the various examinations (and their families);

3) Policy-makers in schools and school districts, colleges, legislatures
and executive offices, etc. who use test scores in symbolic
ways to guide analyses of existing conditions in education and
proposed changes in educational policy;

4) The general public (and the media), which also uses test scores
in very raw and symbolic ways so as to create the environment for
change or stability, an environment that influences the ways in
which principals and school boards, college deans and presidents,
state legislators and others will subsequently act;

5) Test developers and administrators, researchers and other
scholars.

In these comments, I am basically adopting the perspective of #3, and,
in light of recent national assessments of the state of education, am
naturally interested in historical trends.

3. Limitations for Policy Makers and Trend Watchers

But the way in which the testing services report scores and data on
test-takers is not intended for any of the above constituencies except
the first.

That is, the primary clients of the testing services are admissions
officers or committees or Graduate Deans who want to compare applicants
to their institutions or programs in any one year. These clients are
not interested in historical trends (nor are the students who take the
tests in any given year). Their interests are wholly understandable;
and the data are generated for them. In fact, they own the testing
programs and control testing policies.

How does the client affect the way in which the data are reported? One
example from the college admissions level may suffice: a reader of an
annual series of mean scores of College Board achievement tests will
notice an occasional blip -- up or down 20 points, let us say -- in what
otherwise seems to be a fairly stable series of numbers. The explanation
provided by ETS is that a particular achievement test was re-scaled
to the SAT in that particular year. Rescaling is a perfectly legitimate
statistical procedure that removes the differences in verbal and
mathematical ability among populations taking particular achievement
tests. It "re-sets" the mean score on an achievement test to the mean
SAT scores of those who took that achievement test.
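
The effect of such a rescaling on a published series can be sketched in a few lines. Everything in the sketch is an assumption made for illustration: the five scores, the reference mean, and above all the simple additive shift, which stands in for whatever procedure ETS actually used (the paper does not describe it, and it was surely more elaborate than a constant offset).

```python
def rescale_to_reference(scores, reference_mean):
    """Shift every score by a constant so the group mean equals reference_mean.

    A deliberately simplified, additive stand-in for a real rescaling.
    """
    current_mean = sum(scores) / len(scores)
    shift = reference_mean - current_mean
    return [s + shift for s in scores], shift


# Hypothetical achievement-test scores on the old scale, and a
# hypothetical mean SAT score for the same group of examinees.
old_scores = [480, 500, 520, 540, 560]
mean_sat_of_group = 540

new_scores, shift = rescale_to_reference(old_scores, mean_sat_of_group)
new_mean = sum(new_scores) / len(new_scores)
print(f"shift applied: {shift:+.0f} points; new mean: {new_mean:.0f}")
```

In this toy case the gaps between examinees are untouched, which is all an admissions officer needs, while the published mean moves by 20 points in the rescaling year for reasons unrelated to what students learned.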

The college admissions officer is not interested in what that rescaling
does to the historical trend of scores: he or she simply wishes an
accurate way of comparing applicants who present different CEEB
achievement test scores. While there may be other reasons for rescaling,
the fact is that it occurs with some regularity on achievement tests, at
least on the college entrance level. As indicated by internal ETS
studies, the cumulative effects of rescalings are rather significant,
though both the magnitude and direction of the effects differ by test.
(3)




The point of the example, though, is that because the data are not
designed for us, they are hard for us to interpret. Thus, too, College
Board achievement test means are not comparable over any extended period
of time, so we cannot use them to reach conclusions about trends in the
achievement of high school graduates in specific subject fields.

On the other hand, with the exception of the MCATs, none of the tests
taken by applicants to graduate and professional schools was re-scaled
after its introduction (or in some cases, revision) until 1982-3. A
rescaling of the GRE Subject Area Tests was proposed and considered by
the GRE Board in 1969, but was rejected on the grounds that across-field
comparisons were not as necessary or helpful as historical patterns
within fields. So it is wholly appropriate to look at historical trends
in these scores, aided by the data that have been collected since 1975-6
on the background characteristics of the test-takers. In a sentence,
this stability is one of the principal statistical justifications for
undertaking this study.

4. Sources of Data 

This is a secondary analysis of existing data, and the data used were
drawn from a variety of sources. It is best to describe the sources
first, and then to comment on the adequacy and quality of the data
provided in the course of our analysis.

Graduate Record Examinations: General and Subject Area

The basic data on mean scores, standard deviations and candidate
volume were drawn from an unpublished chart provided by the
Educational Testing Service, covering the years 1963 to 1983. More
detailed data on the background characteristics of test takers and
cross-tabulations of scores according to different background variables
were drawn from the published series, A Summary of Data
Collected from Graduate Record Examinations Test Takers During [year]
(hereafter referred to as GRE Summary). This series has been
published every year since 1975-1976. The data contained in this
annual publication apply only to the GRE "General" Examinations
(Verbal, Quantitative, and Analytic). If there is comparable public
data for the Subject Area tests, I am unaware of it (though it is
obvious from the Guide to the Use of the Graduate Record Examinations
that at least some of this data exists). The reader should note that
I do not include the GRE/Analytic test scores in this analysis
because the examination was introduced recently (1977) and (more
importantly) has undergone one major reconstitution since (in 1981).
The amount of comparable performance data on the GRE/Analytic is
hence too limited for our purposes.



Graduate Management Admissions Test 

The basic data on mean scores and candidate volume for the years
1965-1982 were drawn from an unpublished chart provided by the
Educational Testing Service. Standard Deviations for the years
1965-1977 were obtained by telephone conversation with ETS staff.
Standard deviations, other test score data, background characteristics
of test-takers and cross-tabulations of scores by different
background variables were drawn from an unpublished set of tables
prepared each year since 1977-1978 under the copyright of the Graduate
Management Admissions Council, and used here with its permission.
The data for 1980-1981 were obtained from a published version of
these tables, A Demographic Profile of Candidates Taking the Graduate
Management Admission Test During 1980-1981 (Princeton, N.J.: Graduate
Management Admissions Council, 1982).

Medical College Admission Test 

All the data on mean scores, sub-test scores, standard deviations,
candidate volume, background characteristics of test-takers and
cross-tabulations of scores by different background characteristics
were drawn from an annual series of reports prepared by the Division
of Educational Measurement and Research of the Association of American
Medical Colleges, Medical College Admission Test: Percentile Rank
Ranges for MCAT Areas of Assessment [and] Summary of Score
Distributions. This particular information on the background characteristics
of MCAT test-takers is less detailed than that on the other examinations.
I focus principally on the annual reports since 1977, as that
was the year in which the current version of the MCATs was
introduced.

Law School Admissions Test

The basic data on mean scores, standard deviations and candidate
volume were drawn from an unpublished chart provided by ETS to the
National Commission on Excellence in Education in 1983, and
subsequently by the Law School Admissions Services (which took over
the administration and records of the LSATs from ETS in 1979). In
addition, and with the permission of the Law School Admissions
Council, LSAS prepared a separate set of data analyses especially for
this paper, covering characteristics of test takers and
cross-tabulations of scores by different background variables for the
years 1975-1983.

On the basis of the LSAS data analysis, it was obvious to me that the
ETS data excluded some test-takers who should have been included in
the aggregate numbers and included other data that should have been
excluded.* Thus, the LSAT data used in this paper are broken into
two periods, by source: 1964-1974 (ETS) and 1975-1982 (LSAS).



* We were able to control what was in the LSAT data from 1975-1982 by
defining the universe as those who actually took and completed the
examination (without subsequent cancellation of scores) at all regular
administrations (including those who, for religious reasons, use special
Monday administrations). For those individuals who took the LSAT more
than once in a given testing year, only the most recent score (and
accompanying background data) is included in our universe. This
definition precluded some of the irregularities that appear to be in the
pre-1975 data.



5. Inconsistencies in Reporting Data

As the LSAT example indicates, there is a considerable lack of
consistency in reporting both test scores and data concerning the
test-taking population. A quantitative historian would spend years
untangling the mess in order to obtain comparable data over time.

So as not to be misinterpreted, however, I should stress that individual
scores reported to students and graduate or professional schools are
accurate. Those with a personal interest (students or admissions
committees) should not be excessively concerned if the aggregate data
are inconsistently reported to those with academic or policy interests.
But we should be concerned, nonetheless, because policy decisions are
often made on the basis of our perceptions of the aggregate data.

For an example of inconsistency, the reader will note that on Table B
(data concerning the background characteristics of GRE, GMAT, MCAT, and
LSAT test-takers) the figures for some variables are reported for all
test-takers, while for others, the figures are those for first-time
test-takers only, and for others, still, for first-time test-takers who
are U.S. citizens. One can reasonably assume that there will be a
difference in the aggregate characteristics (and test scores) of all
test-takers versus first-time test-takers, let alone other universes
used in the reporting. The more complex charts used in the GRE Summary
bear that assumption out.

For example, again, in the annual GRE Summary there was no simple figure
indicating the percentage of women test-takers until 1978-1979! And
as for the percentage of women aspiring to doctoral or post-doctoral
study, the universes for reporting changed considerably from 1975-1979
to 1980-present.

While it is difficult to require the administrators of different
examinations to ask background questions in the same way, one is
dismayed that in the simple matter of identifying a student's
undergraduate major field, no two of the major examinations are
comparable. As the reader will note later in this discussion, that
inconsistency hampers our insight concerning the movement of students
from one field to another.

Yet another complication (beyond the control of the testing services) is
that not all the test-takers will answer all the questions on background
information questionnaires that accompany each test. On some questions
-- depending on where they are and how they are phrased -- the number of
non-respondents is significant. As ETS researcher Jerilee Grandy notes,
"examinees who complete the background questionnaire . . . tend to have
somewhat higher test scores," a fact that is particularly noticeable
when one analyzes test scores by undergraduate major. (4) As long as we
understand this phenomenon, readers should not be frustrated or puzzled
when they find that the subject-universe for one variable is different
from the subject-universe for another, or where my tables disagree as to
the total number of test-takers in a given year or the mean scores and
Standard Deviations for a given test.



6. Size of the Test-Taking Population

Despite the inconsistencies, there is another very simple statistical
justification for engaging in this inquiry: a lot of people take these
tests. In the most recent year for which I am including data on all of
the tests (1981-1982), the total number of test-takers (minus those whom
we know for sure are not graduates of U.S. colleges) was as follows:

                                            Minus         % Receiving
                                            Non-U.S.      Degree in
                                  Total     Citizens*     1981 or 1982

Grad. Record General Exams:      256,381      17,291         54%
Grad. Management Admiss. Test:   203,304      38,807         42%
Law School Admissions Test:       99,928      ?????          58%
Medical College Admiss. Test:     47,597      ?????          N.A.

TOTAL:                           607,210  -   56,098  =  551,112

(In addition, 80,149 scores were recorded on 19 GRE Subject Area
tests. For purposes of this calculation, I assume that all those who
took the subject area tests also took the GRE General Exams and
should not be counted twice.)

Is that number (about 550,000) substantial? The question Is less 
simple-minded than It appears. While there may be some overlap (thoagh 
it Is impossible to determine how 'many people take more than one of 
these tests), this number should be set against a "pot^^t^^l pool of 
examinees" determined as follows: (5) 

(a) Since half of the test-takers in 1981-82 received
their bachelor's degrees in either 1981 or 1982, we start
with a base of bachelor's degrees awarded in those years:        1,880,000

(b) Since we are trying to determine a participation rate
for U.S. citizens, we should remove from (a) all non-resident
aliens who received bachelor's degrees in 1981 and 1982:           (45,000)

(c) Since, as Table B shows, some 23% of the GRE examinees
and 11% of the LSAT examinees were already enrolled in
graduate or professional school, we should add a percentage
of graduate and first-professional enrollments minus
enrollments of non-resident aliens at those levels (1,386,000 -
91,000). I have taken 20% as the product of aggregation based
on the two examinations. Hence, 20% of 1,295,000:                  259,000

Thus, one very rough measure of the potential pool of these test-takers
is 2,094,000. When 551,000 take the tests, that means 26% of the pool
(v. 65% of the pool of secondary school students who take either the
SATs or the ACTs).
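The arithmetic behind the potential pool can be retraced in a few lines. The sketch below simply restates the figures quoted above; the variable names are illustrative and are not taken from the report or its tables.

```python
# Test-takers in 1981-82, as quoted in the text.
test_takers = {
    "GRE General": 256_381,
    "GMAT": 203_304,
    "LSAT": 99_928,
    "MCAT": 47_597,
}
# Non-U.S. citizens are reported only for the GRE and the GMAT.
known_non_us = {"GRE General": 17_291, "GMAT": 38_807}

total_test_takers = sum(test_takers.values())
us_test_takers = total_test_takers - sum(known_non_us.values())

# Steps (a), (b) and (c) of the potential pool.
bachelors_degrees = 1_880_000             # (a) degrees awarded in 1981 and 1982
nonresident_alien_bachelors = 45_000      # (b) removed from the base
graduate_enrollment = 1_386_000           # (c) graduate and first-professional
nonresident_alien_graduates = 91_000
share_already_enrolled = 0.20             # the author's aggregate estimate

enrolled_component = round(
    share_already_enrolled * (graduate_enrollment - nonresident_alien_graduates)
)
potential_pool = bachelors_degrees - nonresident_alien_bachelors + enrolled_component
participation_rate = us_test_takers / potential_pool

print(f"U.S. test-takers:   {us_test_takers:,}")
print(f"Potential pool:     {potential_pool:,}")
print(f"Participation rate: {participation_rate:.1%}")
```

Changing `share_already_enrolled` is a quick way to see how sensitive the participation rate is to the 20% assumption in step (c).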



*The GRE data distinguish between non-resident and resident aliens. I
use the non-resident figure. The GMAT, on the other hand, does not
distinguish between resident and non-resident aliens. And neither the
LSAT nor the MCAT reports separate data for non-U.S. citizens.






Most of the adjustments that one might make to this very rough estimate
would have the effect of lowering the number in the pool, hence raising
the percentage of examinees. For example, William Turnbull writes that
my estimate for graduate enrollments probably overstates the pool "since
most of the enrolled graduate students who take the GRE are likely to be
in their first year." (6) While I do not have the data to prove the
case, Turnbull's hunch is probably right, and might bring the proportion
of examinees up to 28% of the potential pool.

A small number of graduate departments do not require or recommend the
GRE/General examinations, but do require the Subject Area Tests. Thus,
were we to modify our original assumption and estimate that 10-15% of
the Subject Area examinees did not take the GRE Generals, we would raise
the number of test-takers, and add perhaps a half percentage point to
the proportion of examinees relative to the potential pool.

Yet another adjustment would be based on the notion of a "likely pool."
In other words, instead of all baccalaureate degree recipients, we might
eliminate those who received degrees in fields for which the
baccalaureate has historically been the terminal degree. But that would
involve some arbitrary decisions and would load the dice. I would
rather be conservative and, on the basis of historical data (at least as
far back as 1971)*, posit a range of 25-30% of the pool.

Let us not complexify this matter: 550,000 is a big number and what the
statisticians call a "robust sample." It is a self-selected sample, and
also one that is partly driven by the entrance requirements of
professional schools and research universities and by the application
requirements of fellowship sponsors. Virtually all accredited medical,
law, and business schools require their respective examinations, and, as
Table J indicates, with very few exceptions, 65% or more of the
Ph.D.-granting graduate departments in the major disciplinary fields
either require or recommend the GRE Subject Area tests (furthermore,
these percentages have not changed significantly since 1971). So in

addition to [illegible in source]
population, at least in terms of presumed ability. Whatever quibbles we
might have, this is the only substantial annual sample we possess from
which to make some inferences concerning the quality of undergraduate
learning in the United States in recent years.

7. Are Test Scores Predictive or Reflective?

Some may argue that the tests in question here are all predictive rather
than reflective. That is, they may say that most questions on an
examination are designed to determine how well a student will perform at
the next higher level of education, not how well a student has learned
the material taught at his/her current level of education.



* In 1971-72, for example, there were 546,145 test-takers, of which
(though one must estimate from trends here) 518,500 were U.S. citizens.
Using the same formula to determine the potential pool, we come up with
1,961,000. The ratio of test-takers to the potential pool was thus
26.4%, almost exactly the same as it was a decade later.






From one point of view, this distinction is very subtle; from another,
it is bogus. What I mean may be best illustrated by Jonathan Warren's
comment that the GRE Subject Area tests are "too generalized" to be
reflective. "Generalized," as Warren admits, is not a good term.
Instead, one might say that in any discipline, there is a common
denominator or "core" of subject matter that a graduate admissions
committee expects a student to have mastered, and it is that common core
that is reflected on the GRE Subject Area test. But some college
departments pride themselves on specialties that lie outside that common
core. Their students will thus not perform as well on the examination as
others. Warren thus suggests that our existing tests are unsuitable for
program evaluation or assessment, and are not really reflective, even
though (as I would add) the "core" is evidently sufficient for
predictive purposes. (7)

But are the examinations unsuitable? The Committee of Examiners for
each GRE Subject Area Test recommends test specifications that weight
different sub-fields within the overall set of questions. It then draws
questions "from the courses of study most commonly offered." (8) For
Political Science, for example, the specifications for the 170 questions
used on current (1982-1984) versions of the GRE test are as follows:

"I.   U.S. Government                                        30-35%
      Questions will cover the major subfields of United States
      government, including institutions, processes of national
      and subnational politics and public administration.

 II.  Comparative Political Systems                          20-25%
      Some questions will be area or country specific; others
      will be concerned with comparative institutions and
      administrative processes. Questions will deal with
      developed and developing states.

 III. International Relations                                15-20%
      Questions will cover international politics, theory,
      organizations, law and political economy as well as
      United States foreign policy.

 IV.  Political Theory and History of Political Thought      20-25%
      Questions will deal with normative and empirical,
      conceptual and analytic matters, as well as the ideas of
      major political thinkers.

 V.   Methodology                                            10%
      Questions will deal mainly with methods and techniques
      involved in empirical research. Many of these questions
      will also involve the subfields described above." (9)

The comparative weights of these subfield sections may not match the
political science program at every college, but that is not the
intention. Nor will they likely match the weights of the programs taken
by individual students. To be sure, a student who took 12 semester
courses in political science, of which 8 were in comparative political
systems and international relations, will be at a disadvantage on this
examination. The test does not cater to undergraduate sub-field
specialists.




What it reflects instead is an ideal balance of the undergraduate
curricula offered. If the Committee of Examiners (all of whom are
political scientists appointed with the advice of the American Political
Science Association) represents the field well, then we have to appeal
to their authority. And if they establish a universe of questions "from
the courses of study most commonly offered" in undergraduate political
science programs, then it stands to reason that the test is reflective.

A more technical term for what is at issue here is "content
representativeness," i.e. the degree to which the examinations "draw
upon content that is basic to and most important for success in graduate
school." (Oltman, 1982) When a study of the "content representativeness"
of a Graduate Record Subject Area test is undertaken, faculty are asked
to judge an existing version of the test against: (a) what is actually
taught in their own departments, and (b) an "ideal" undergraduate
curriculum in a discipline. The results of such a study on the
Chemistry, Computer Science and Education tests (see Table G)
demonstrate that the weighting of the tests ("Committee Specifications")
was somewhat different from the judgments of teaching faculty. Whether
that should be the case or whether the faculty should be wiser than the
Committee, the very question indicates a reflective purpose in the
tests. As W. Ann Reynolds, a member of the Committee of Examiners for
the GRE Biology test in the early and mid-1970s reflected, "it was
always very clear in our minds that we were writing questions to
ascertain how well students had mastered various aspects of the
biological sciences." (10)

Perhaps the whole issue is a circumlocution of logic; and I would not
dwell on it if it had not been raised by others. Of course we expect
graduate students to have mastered appropriate subject matter and habits
of mind necessary for studying a discipline or profession at an advanced
level. Studies of the predictive validity of these tests tell us how
well the tests measure for it, not how well the general test-taking
population has learned in college. Such studies look only at those
test-takers who actually enter and complete at least one year of
graduate or professional school; and our understanding of the reflective
qualities of the examinations is limited if we look only at that group.
As Wigdor and Garner observe:

"When the accepted group is a select subset of the applicant
group, the correlation with the criterion is lower than it
would be for all applicants. In extreme cases the reasons
for a lower correlation in a selected group are easy to see.
In a basketball league that was limited to people with heights
of 5'10" and 5'11", height would not be expected to be a good
predictor of a player's average number of rebounds per game.
If player heights vary greatly, however, the correlation
between height and average number of rebounds would be
expected to be higher." (11)
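The restriction-of-range effect in the basketball example can be seen in a small simulation. What follows is only a sketch under invented assumptions: the height distribution, the rebound formula, and the noise level are hypothetical and have nothing to do with any data in this paper. It compares the correlation between height and rebounds for all players with the correlation for players between 5'10" and just under 6'0".

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient, computed from first principles."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical league: heights in inches, rebounds loosely tied to height.
heights = [rng.gauss(75, 4) for _ in range(5000)]
rebounds = [0.5 * (h - 60) + rng.gauss(0, 3) for h in heights]

r_all = pearson(heights, rebounds)

# Restrict the league to players from 5'10" (70 in.) to under 6'0" (72 in.).
narrow = [(h, r) for h, r in zip(heights, rebounds) if 70 <= h < 72]
r_narrow = pearson([h for h, _ in narrow], [r for _, r in narrow])

print(f"all players:        r = {r_all:.2f}  (n = {len(heights)})")
print(f"restricted heights: r = {r_narrow:.2f}  (n = {len(narrow)})")
```

The relationship between height and rebounds is the same in both groups by construction; only the spread of heights differs, and that alone is enough to shrink the correlation in the restricted group.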

Even so, as Anne Anastasi has observed, all these tests measure the
"current status" of student learning and development, regardless of
"whether their purpose is terminal assessment or prediction," (12) and
"current status" is influenced by a student's past course-taking
patterns. As Nancy Burton of ETS points out, a score on the GRE Subject
Area Test in Economics may be "affected more by how recently the student
took mathematics than by the Economics the student studied. Therefore,"
she concludes, "the test might be excellent for selection, but poor as a
measure of undergraduate Economics curricula." (13) This is a viable
hypothesis about curricular co-requisites, one which could be
investigated through an analysis of the college transcripts of
test-takers (see p. 38 below); but it still does not change the
conclusion, as the Economics curriculum has become more quantitative in
orientation due to changes in the professional practice of Economics.
The Committee of Examiners itself acknowledges that evolution in its
description of the GRE Economics test (14), and one suspects that the
vast majority of undergraduate Economics programs now require at least
Economic Statistics if not other mathematics and quantitative Economics
courses. So we would expect the examination to measure the "current
status" of student learning in those programs.

Reinforced by such common-sensical distinctions, the lay public will
reason that the GRE Subject Area Tests are the only nationally validated
measures to assess the "current status" of undergraduate learning in the
disciplines (and their co-requisites) and that the LSAT, GRE/General
Examinations (Verbal and Quantitative), GMAT and portions of the MCAT
examinations are the only nationally validated measures of student
competence in analysis, problem-solving, and verbal facility at the
college level. Therefore, when over a half-million U.S. citizens take
these tests annually, we are justified in looking at changes in scaled
test scores as reflective of the quality of college student learning.
Whether these particular tests are the best of all possible measures or
whether the quality of test performance predicts performance in graduate
school, professional school, or subsequent career is beside the point in
the symbolic environment of public interpretation. (15)

8. How Should We Measure Change In Scaled Test Scores? 

For reasons of easy public reference, we would like to be able to make
statements such as "The mean SAT scores have declined by X per cent over
the past 10 years," or that "The mean LSAT score has risen by X per cent
over the past 10 years."

The Point Approach 

But the testing services never make such statements. Instead, they say
that the SAT has declined by so many points or the LSAT has risen by so
many points. Let us adopt a piece of the terminology from testing to
discuss this issue and ask whether these statements, phrased in terms of
"points," have "face validity." That is, is what they are saying
obvious?

If you observed that the temperature fell 10 degrees last night, the
meaning of your statement is very different depending on the scale you
are using. Fahrenheit? Celsius? Reaumur? For a farmer or a chemist
(in the case of the Reaumur scale), the difference is critical.

All the tests we are talking about involve raw scores which are then
statistically translated onto a scale. If all the tests had the same
number of questions and the same scales, a statement such as "the mean
GRE/Verbal score declined 61 points between 1964 and 1982 while the mean
score on the GRE Area Test in Sociology declined by 113 points" might
provide some reliable clues about comparative academic performance on
these two measures over the same period of time.

But the scales for each one of these tests are different, as are the
number of questions (and the fewer the number of questions, the more
volatile the scores). Most of this data is presented for all 24 tests
and sub-tests under examination in Table C. An excerpt may help to
underscore the point:

                    (1)            (2)                  (3)
                  Maximum        Minimum              Number of
                  Reported       Theoretical and      Points on Scale
Test              Score          Reported Score       (#1 - #2)

GRE/Verbal          850*            210                  640
Chemistry           990             440                  550
Physics             990             370                  620
Economics           990             400                  590
Sociology           990             210                  780
English Lit.        810             250                  560

*Was 900 from 1952-1980. Reduced to 800 in 1981.

It is obvious that each scale is different, and that a 61 point decline
on the GRE/V scale (210 to 850, or a 640 point scale) would probably not
be as severe as a 61 point decline on the GRE Chemistry test scale
(440 to 990, or a 550 point scale).

Given the difference in scales, a statement about changes in mean scores
that is phrased in terms of points does not have face validity.
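To see why "points" travel badly from one test to another, one can express the same 61 point change as a share of each scale in the excerpt from Table C. This is only an illustrative sketch: the scale limits are those quoted above, the helper names are invented here, and share-of-scale is itself a crude yardstick rather than a recommended measure.

```python
# (maximum reported score, minimum reported score) from the Table C excerpt.
scales = {
    "GRE/Verbal": (850, 210),
    "Chemistry": (990, 440),
    "Physics": (990, 370),
    "Economics": (990, 400),
    "Sociology": (990, 210),
    "English Lit.": (810, 250),
}

decline_in_points = 61

# Width of each scale and the fraction of it that 61 points represents.
scale_width = {test: hi - lo for test, (hi, lo) in scales.items()}
share_of_scale = {
    test: decline_in_points / width for test, width in scale_width.items()
}

for test, share in share_of_scale.items():
    print(f"{test:<13} {scale_width[test]:>4} points wide  "
          f"61 points = {share:.1%} of the scale")
```

The same 61 points covers a noticeably larger share of the Chemistry scale than of the Sociology scale, which is the sense in which a statement in points lacks face validity.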

Arithmetic Approaches

The attempt to make statements about percentage declines or increases in
mean scores of tests (and I have made my share of those statements over
the past few years) involves simple, arithmetic approaches. Let us use
the GRE/Verbal scores for 1964 and 1982 (the extremes of the period with
which this paper deals) to illustrate these approaches:

                     1964-5     1981-2
Mean                   530        469       -61 points
Stand. Deviation       124        130

The decline in the mean score between these two historical benchmarks 
was 61 points (we will deal with the Standard Deviation in a moment). 
What kind of percentage decline is that? 






(1) Is the percentage decline 61/530, or 11.5%? That might make
    sense if the scale started at zero. But the scale does not
    start at zero: it starts at 210 (the average minimum score on
    seven forms of the GRE used between 1973-1977).

(2) If we say that 210 = 0, then is the percentage change
    61/(530-210), or minus 19.1%? That would appear to make sense
    if the scale was 1000 points in value, i.e. 210 to 1210. But
    the scale for the GRE/Verbal is 210 to 850. Instead of 1000
    points in value, it is 640 points in value.

(3) If the scale for the GRE/Verbal is 640 points in value, each
    point is worth 1/640, or .001562. So, is the percentage
    decline for the GRE/V 61 x .001562, or 9.5%? That sounds
    fairly sensible, and perhaps is the most sensible of the
    otherwise misleading arithmetic methods.

These three arithmetic methods, each referring only to changes in mean
scores, yield three very different results: 11.5%, 19.1% and 9.5%.
Most of us would stare at those figures and conclude that there has to
be something wrong with the method.

There is: we have not described what is really being measured because we 
have not identified the basic point of reference accurately. 
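The three arithmetic methods are easy to reproduce side by side. The sketch below uses only the GRE/Verbal figures given in the text (means of 530 and 469, a scale running from 210 to 850); the variable names are illustrative.

```python
mean_1964 = 530
mean_1982 = 469
scale_min = 210
scale_max = 850

decline = mean_1964 - mean_1982  # 61 points

# Method 1: decline as a share of the base-year mean (treats the scale as
# if it started at zero).
method_1 = decline / mean_1964

# Method 2: decline as a share of the base-year mean measured from the
# bottom of the scale (treats 210 as zero).
method_2 = decline / (mean_1964 - scale_min)

# Method 3: decline as a share of the full width of the scale.
method_3 = decline / (scale_max - scale_min)

for label, value in [("Method 1", method_1),
                     ("Method 2", method_2),
                     ("Method 3", method_3)]:
    print(f"{label}: {value:.1%}")
```

One change in the mean, three denominators, three different "percentage declines": the disagreement comes entirely from the choice of reference point, not from the data.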

The Geometric Approaches 

If we were dealing with raw scores on these tests, the arithmetic
approaches might make more sense. But we are looking at arbitrary
scales, scales with historical baggage,* and benchmarks (scores) along
those scales that are being "equated" (smoothed out by statistical
adjustments) from year to year as the number of questions on a given
test changes, as the distribution of those questions across fields and
tasks is reweighted, and as the level of difficulty of those questions
and the mix of difficulty levels shifts. The scale is sacred to the
testing community; but these scales are not universal measures like
Fahrenheit degrees. When one changes the scale itself (as the LSAT did
in 1982 and as the MCAT did in 1977 to accompany the change in the
entire character and construction of that examination), we lose one
history and must begin another one.

The measurement of change on a scale demands attention to some very
basic statistical constructs, the most basic of which is the Standard
Deviation. The Standard Deviation is a measure of variance (or
geometric dispersion) of scores that accounts for the abilities of
students who take an examination. We ignore it in this regard at our
peril, and we ignore it to the detriment of students. Ideally, the
Standard Deviation should tell us the range of scores around the mean
within which roughly 2/3rds of the cases on a given test in a given
year fall. That may not be a technically eloquent way of phrasing
it, but it will have to do.



* The GREs, for example, were standardized in 1952.




The Standard Deviation thus provides us with interpretive guidelines for
the scores of a majority of students. When the change in scores over
time exceeds the range of expectations inherent in the Standard
Deviation, then (depending on the direction of change) we ought to be
worried or ought to be pleased. Again, that is not a technically
accurate phrasing of the issue, but it's going to have to do.

Changes in mean scores over time, then, can be measured as a fraction of
the Standard Deviation in the base year. When these changes approach or
exceed one full Standard Deviation (1.00), then we are observing very
significant change. That is a theoretical benchmark which, as we will
shortly see, does not always fit the circumstances.

There are two ways of calculating this fraction:

(1) If we use the Standard Deviation for the base year (1964) only,
    then we get a computation for the GRE/Verbal that looks like
    this:

         -61  (change in scale points)
        ----------------------------------  =  -.49
         124  (Standard Deviation in 1964)

    Translation: over an 18 year period, the change in the mean score
    of the GRE/V was roughly one-half a Standard Deviation. The
    direction of change was obviously negative.

(2) There is a slight problem with Geometric Method #1. It is
    really quite minor, but, in the interests of technical
    accuracy, ought to be addressed. The Standard Deviation takes
    account of real groups of students who take the tests and
    their abilities. Obviously, a different group of students
    takes the tests each year. When we compare scores from 1964 to
    those of 1982, or to any year in between, we are not looking at
    the same group of students.

    So it is a technical mistake to base our calculations on the
    Standard Deviation in the base year. Certainly, over an 18 year
    period, we would distort or misrepresent the composition of the
    student groups taking these tests by so doing.

    What do we do instead? There is really no way to render the
    groups of test-takers over 18 years equivalent to each other so
    that our comparisons of performance can be scientifically
    accurate. In the absence of statistical guidelines, let us try
    something that at least sounds common-sensical: let us use the
    mean Standard Deviation for the 18 years in question. Actually,
    using the mean S.D. (as opposed to the base year S.D.) does not
    change the results that much.

    For the GRE/Verbal, the computation would thus be:

         -61  (change in scaled points)
        ----------------------------------  =  -.48
         128  (mean S.D., 1964-1982)
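Both geometric methods reduce to the same division with a different denominator. A minimal sketch, using the GRE/Verbal figures given above (the function name is hypothetical, and the mean S.D. of 128 is taken as stated in the text rather than recomputed from yearly data):

```python
def change_in_sdu(mean_start, mean_end, standard_deviation):
    """Change in mean score expressed in Standard Deviation Units."""
    return (mean_end - mean_start) / standard_deviation


mean_1964, mean_1982 = 530, 469
sd_base_year = 124      # Standard Deviation in 1964
sd_period_mean = 128    # mean Standard Deviation, 1964-1982, as reported

geometric_1 = change_in_sdu(mean_1964, mean_1982, sd_base_year)
geometric_2 = change_in_sdu(mean_1964, mean_1982, sd_period_mean)

print(f"Geometric #1 (base-year S.D.): {geometric_1:+.2f}")
print(f"Geometric #2 (mean S.D.):      {geometric_2:+.2f}")
```

The two results differ by about one hundredth of a Standard Deviation, which is the sense in which the choice of denominator "does not change the results that much."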






For better or for worse, the last of our methods (Geometric #2) will be
used in this paper, except when we are looking at fewer than 10 years of
data on a particular test (and then we will use Geometric #1).

9. How Do We Judge the Magnitude of Change?

The notion that a change of 1.00 in Standard Deviation Units (S.D.U.) is
a touchstone of significance is an arbitrary one. One has to develop a
sense of degrees in the context of specific data and make some judgment
calls. In Investment in Learning, Howard Bowen provides some guidelines
for describing "estimated average changes in cognitive learning
resulting from college education" when one measures in terms of Standard
Deviation Units (16), and I borrow from him here.

Bowen applies his descriptors to change in the learning of the same
group of students over a period of four years and uses a total of six
gradations. Our figures cover 7-18 years of the performance of
different students. As our variations will thus be greater, we need
more descriptive categories, and I have chosen the following:

Estimated Change as
Expressed in S.D.U.s          Descriptive Judgment

+.75 or above                 Extreme increase
+.40 - .74                    Large increase
+.20 - .39                    Moderate increase
+.10 - .19                    Small increase
(.09) - +.09                  NO CHANGE
(.10) - (.19)                 Small decline
(.20) - (.39)                 Moderate decline
(.40) - (.74)                 Large decline
(.75) or below                Extreme decline

10. Changes in Performance, 1964-1982



Let us now turn to the bottom line as presented in Tables C and D.
There are 24 tests and sub-tests at issue (including the division of the
LSAT data into two periods). Of the 24, we have 13-18 years of data on
18, and 6-8 years of data on 6 (all of which are either sub-tests of the
MCAT or the two periods of the LSAT). Let us make our task easier for a
moment, look only at the long-term measures, and combine the two periods
of the LSAT. Then, let us distinguish between tests with large numbers
of test takers and those with comparatively small numbers of examinees.
The former consist of the tests of "general learned abilities," the
GRE/V, the GRE/Q, the GMAT and the LSAT. The latter consist of 15 GRE
Subject Area tests. In very gross terms, here's what happened:

                High Volume            Low Volume
                Tests                  Tests
                (GRE, GMAT, LSAT)      (GRE Subj. Area)

Advances             1                      2
No Change            1                      2
Declines             2                     11






Now, if that were a summary of stock market action for a day, a week, or
a year, one would be hard-pressed to call it bullish.

Remember that these are trends in separate measures, and that most of
the students who take the GRE Subject Area tests (low volume) also take
the GRE/V & Q (high volume). We are not counting the same students
twice; rather, we are distinguishing between two types of performance,
general and subject-specific, and are using the trends as indicators.
While the general performances may be more important simply by weight of
numbers, no matter which type of performance we examine, the 18-year
trend is the same: down. At the same time, as we will see, shorter term
measures (the 6-year trend) evidence a more neutral pattern.

Staying with the surface data for a moment, we should ask whether the
changes in scores are significant. Note (in Table D) that six of the 15
test scores (longer term or short term) that declined evidence "large"
or "extreme" declines, but that none of the four test scores that
increased did so by a large amount. So our intuitive perception of
significant decline is probably justified.
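The descriptive judgments used here ("large," "extreme," and so on) follow mechanically from the bands set out in section 9. As a hedged sketch, the lookup might be written as follows; the function name is hypothetical, and rounding to two decimals before comparing is an assumption made here so that values falling between printed bands (such as .195) land in a definite category.

```python
def describe_change(sdu):
    """Map a change in Standard Deviation Units to a descriptive judgment.

    Bands follow the table in section 9: under .10 in either direction is
    "No change"; .10-.19 small; .20-.39 moderate; .40-.74 large; .75 or
    more extreme.
    """
    magnitude = round(abs(sdu), 2)  # assumption: compare at two decimals
    if magnitude < 0.10:
        return "No change"
    direction = "increase" if sdu > 0 else "decline"
    if magnitude >= 0.75:
        size = "Extreme"
    elif magnitude >= 0.40:
        size = "Large"
    elif magnitude >= 0.20:
        size = "Moderate"
    else:
        size = "Small"
    return f"{size} {direction}"


# The GRE/Verbal change computed earlier by Geometric Method #2.
print(describe_change(-0.48))
```

Applied to the GRE/Verbal figure of -.48, the function returns "Large decline", which matches the band of (.40) to (.74).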

We know that not everyone who takes the tests attends graduate or
professional schools; and we know that not everyone who attends graduate
and professional schools has taken the tests. But if the half-million
people who take these tests come from the top 25-30% of their college
classes, and if the general decline is as severe as it appears, then
perhaps we ought to start looking more closely at the quality of what is
taught and learned in American higher education (at least in the fields
in which the declines are most significant).

No doubt an objection will be raised. These declines, it will be
argued (even among the top 25-30% of our college graduates), simply
reflect declines in the preparation of entering college students, and
what we really should do is to control the results for student ability
at the point of college entrance, or, at the least, compare these trends
to those on the SAT and College Board Achievement tests.

These objections should be answered in reverse order: in a secondary
analysis of aggregate data, one cannot compare trends in the test scores
of college graduates to those in the test scores of high school
graduates. As I pointed out previously, the College Board Achievement
tests are not good historical guides because the scores are periodically
rescaled. As for the SAT and the ACT as compared to the GRE, LSAT,
GMAT, and MCATs, we have a classic case of apples and pears. These are
different examinations being taken by very different populations.

But the SATs and GREs are the most analogous of these fruits, and invite
comparison as indicators. Performance on the SAT/Verbal fell by .41 of
a Standard Deviation Unit during the period 1960-1978 (the period under
investigation minus four years, to account for the traditional gap
between high school and college graduation), while performance on the
GRE/Verbal fell .48 of an S.D.U. Likewise, performance on the SAT/Math
fell .23 during that period while performance on the GRE/Q did not
change at all. But as soon as we remove foreign students from the GRE
equation, the results change (see p. 27 below), and the difference
between SAT and GRE indicators narrows considerably, to wit:




Change in Performance for U.S. Citizens
(in Standard Deviation Units)

                      SAT               GRE
                  (1960-1978)       (1964-1982)

Verbal               -.41              -.38
Quantitative         -.23              -.15



Is the GRE population here analogous to the SAT population? No. It is
not as homogeneous a group in terms of age, and most will concede that
it is a group of higher relative ability.

It is more important to control for ability in primary statistical
analyses of GRE test scores by reference to earlier scores on the SATs,
but the research literature is surprisingly thin on this issue. (17)
But it is equally important not to transform statistical controls for
ability into an "input approach" to measuring student achievement, for
the "input approach" is of no help in determining what influences
students between matriculation and graduation. To determine student
learning during the college years (at least on the more general of these
measures) we might give the GREs, LSATs, and GMATs to college freshmen,
and then again to college seniors (on the other hand, one could not
prove anything by following that procedure with the GRE Subject Area
Tests). Or, as Turnbull suggests, we can aggregate existing information
on student performance to make a similar assessment. (18)

11. Three Periods of Change, 1964-1982

Even if all we looked at were mean scores, Standard Deviations and
numbers of test-takers, the case of decline is not as simple as
indicated in Tables C and D. One of the striking characteristics of
this data is that there appear to be three distinct periods of change
within the 18 year span under investigation:

Period I: 1964-1970. This is a period of sharp declines in scores, at
   least for the 17 examinations on which we have full data.

Period II: 1970-1976. This is a period of reversal and stabilization,
   i.e. one during which either declining trends reversed direction or
   "based out," reaching a low plateau.

Period III: 1976-1982. This is a period of stability and/or modest
   decline. After "basing out" or reversing trend, the mean scores
   either held steady or resumed their general direction of change
   at less dramatic rates than in Period I.

Table E is intended to illustrate this phenomenon for the GRE
examinations (Verbal, Quantitative, and Subject Area tests), and should
be used in conjunction with Table A. Table E points to the specific
years in which change in direction of the test scores occurred, using
the criterion of ± .10 or more change in Standard Deviation Units from
base years (and, subsequently, from "turning point years"). In examining
Table E, the reader will note that not all the tests demonstrate trends
corresponding to the three periods. The GRE/Verbal and Quantitative
examinations, for example, evidence only two periods, i.e. one "turning
point," as do the GRE Subject Area tests in Engineering, History,
Psychology and French. But even in those cases, the turning point
occurs sometime during Period II, 1970-1976.

Was there anything that happened to the tests themselves, the methods of
administration of tests, or the scoring of the tests during the
1970-1976 period that might account for these changes in trends?

Indeed, the answer to all three questions is "yes." The Graduate Record
Examination Board was established in 1966, and, under its direction, a
number of alterations in both the tests and methods of administration
began to appear. ETS carried out a complete review of the GRE Subject
Area tests in 1970-1972 that resulted in changes in the content and
weighting of the different parts of the examinations. A number of
content validity studies were first conducted; then the Committee of
Examiners for each test recommended changes designed to reflect the way
in which different subjects were actually being taught in our colleges.

Biology, a discipline that underwent a major revolution in the 1960s, is
a classic example. The figures in Table A demonstrate that the mean
score of the GRE/Biology Area test declined modestly from 1964 through
1969-71, when it "bottomed out," then reversed trend, rising until
1975-76. In 1971-72, the test was reweighted internally to reflect,
e.g. the growing emphasis on cellular and molecular Biology in most
college curricula; and consequently, the norms had to be recalculated.
The following year (1972-73), the first year of the new weightings, the
mean score jumped 13 points (a change in Standard Deviation Units of
+.11, which, for a one year change, appears rather significant), while
the Standard Deviation itself fell from 115 to 110. Both changes
reflect a greater agreement between the preparation of undergraduate
Biology majors and the construction of the examination.
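
The conversion from points to Standard Deviation Units that this paper
relies on can be sketched in a few lines. This is an illustration only:
the function name is hypothetical, and dividing by the earlier year's
Standard Deviation (115) rather than the later one (110) is an
assumption, since the text does not say which base was used.

```python
def sdu_change(point_change: float, standard_deviation: float) -> float:
    """Express a change in mean score as a fraction of one Standard Deviation."""
    if standard_deviation <= 0:
        raise ValueError("standard deviation must be positive")
    return point_change / standard_deviation


# GRE Biology, 1971-72 to 1972-73: a 13-point rise in the mean.
# Assumption: the change is scaled by the earlier Standard Deviation (115).
biology_shift = sdu_change(13, 115)
print(f"{biology_shift:+.2f} S.D.U.")
```

Scaling by the later Standard Deviation instead gives a slightly larger
figure, which is one reason to state the base year whenever S.D.U.
changes are reported.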

Do the effects of these reweightings and other internal content
adjustments influence our overall judgment on the general decline in test
scores of college graduates over the period in question? To a certain
extent, yes; though it is impossible to quantify the judgment and though
the judgment applies principally to the Subject Area tests of the GREs.
One might argue that if the content and weighting of the examinations
changed in the early 1970s to reflect better what was actually being
taught in our colleges and universities, then the earlier declines
(Period I) may have been exaggerated. Nonetheless, after the period of
trend reversal, most of the scores resumed their declines, though, as we
have noted, at a more modest rate. And one should add that the trends
in mean scores of the general learned ability tests (GRE/Verbal and
Quantitative, LSAT, and GMAT) were wholly unaffected.

The second possible explanation of trend reversal in Period II has to do
with methods of test administration and score reporting. In 1969, local
administrations of the GREs (General and Subject Area) were eliminated,
i.e. if you wanted to take the test you had to do it at a national
administration where your presence and scores would be included in the
nationally reported data. Or, if your college wanted all Psychology
majors to take the GRE Subject Area test for purposes of program
evaluation, those students would have to troop to a test center on a given
administration day. (19) The coincidence between the period of these
adjustments and the stabilizing or reversal of trends in GRE mean scores
is rather striking, and is backed up by changes in the Standard
Deviations. An increase in the numbers of test-takers reported in the
national data is also noticeable during this period, in part as a result
of changes in rules vis-a-vis the use of the tests in local assessments.

How much did this shift in method of test administration and score
reporting affect the trends we observe? Based on the data available, I
can't tell; but let us speculate. For convenience, let us call those
who voluntarily took the GREs (General or Subject Area) as part of
institutional evaluations the "experimental group." The GRE annual
reports (GRE Summary) do not split this group out from other
test-takers, as the questionnaire does not ask why an individual is
taking a particular test.

If you are a volunteer test-taker, your motivation to perform well is
less than that of someone who needs the best possible score for graduate
school admission. If you are being paid, the compensation is for your
time, not your performance. So we can reasonably hypothesize that the
experimental group will not perform as well on the examination as the
"control group" (i.e. the rest of the test-takers). At the same time,
though, volunteers for such a task tend to be among the better students.
The net effect would probably be a wash.

Even so, how large would this "experimental group" be, and how much
would its performance affect mean scores and Standard Deviations? Once
again, we have to make some speculative inferences. Between the
1968-1969 testing year and the 1969-1970 testing year, there was a surge
across the board in the number of test-takers on the GRE Subject Area
examinations, a surge that is rather noticeable when set against both
preceding and subsequent years (see Table I). Increases of more than
20% in the number of test-takers in 1969-70 can be observed in History,
Biology, Geology, Engineering, Economics, Sociology, Psychology,
Education, and Music.

While many factors were responsible for the overall increase in the
number of test-takers in the late 1960s and early 1970s, the relative
increase in this particular year was probably due to the change in the
administration and score reporting of tests taken for purposes of local
program assessment.

The impact on scores, however, is more difficult to judge. According to
our hypothesis that volunteer test-takers (even though they tend to be
the better students) do not perform as well as those who are in the game
for competitive purposes, the mean scores should go down and the
Standard Deviations should rise with the infusion of any significant
number of this group into the overall population of test-takers. On the
surface, that appears to be what happened. Of 15 Subject Area tests, 13
mean scores declined between 1968-69 and 1969-1970, and 10 Standard
Deviations rose (though three of those rose by a minimal amount). But
the changes in mean scores do not differ from those of the previous
years, at least in general direction. Thus, one is not wholly sure of
the impact of the "experimental group" on score trends.
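
The reasoning behind that hypothesis (a lower-scoring group added to a
population pulls the mean down and pushes the Standard Deviation up) can
be illustrated with a small sketch. Every number below is hypothetical;
no group sizes, means or Standard Deviations for the "experimental
group" exist in the reported data, and the function name is my own.

```python
import math


def pooled_mean_sd(groups):
    """Combine (count, mean, sd) triples into one population mean and SD."""
    total = sum(n for n, _, _ in groups)
    mean = sum(n * m for n, m, _ in groups) / total
    # Total variance = within-group variance + between-group variance.
    variance = sum(n * (sd ** 2 + (m - mean) ** 2) for n, m, sd in groups) / total
    return mean, math.sqrt(variance)


# Hypothetical figures, chosen only to show the direction of the effect.
control = (10_000, 520.0, 100.0)      # competitive test-takers
experimental = (2_000, 490.0, 100.0)  # program-evaluation test-takers

mean_before, sd_before = control[1], control[2]
mean_after, sd_after = pooled_mean_sd([control, experimental])

print(f"mean: {mean_before:.1f} -> {mean_after:.1f}")
print(f"SD:   {sd_before:.1f} -> {sd_after:.1f}")
```

Even with a 30-point gap and a one-sixth share of the combined
population, the rise in the Standard Deviation is small, which is
consistent with the difficulty of detecting the group's influence in
the published figures.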






12. Other Internal and Administrative Influences

Change in Number of Questions. The fewer the number of questions
on an examination, the greater the volatility of scores. Table C
indicates the approximate number of questions on the different
examinations. The reader will note that the GRE Subject Area tests in
Mathematics and Physics (which emphasize problem-solving) have fewer
numbers of questions than other GRE Subject Area tests, and are the only
two GRE tests to evidence increases in mean scores over the 18 year
period. The number of questions should not make a difference with
respect to the direction of change, but the relationship is certainly
noticeable here. The relationship can only be explained with reference
to the quality of students who major in mathematics and physics: they
tend to out-perform most of their peers on tests of general learned
ability (GRE/V&Q, LSAT, GMAT) as well (see Table F, p. 13).

We should also note that the number of questions on all examinations
other than the GRE Subject Area Tests may vary slightly from year to
year, and that not all of those questions count in determining raw and
scaled scores. Many questions are inserted on a trial basis. From the
results, ETS determines the level of difficulty, content validity and
other statistical properties of the questions, hence their
appropriateness for inclusion in the universe of questions from which
the different versions of the test can draw. This variance in number of
questions does not affect test results.

Scoring Methods. Nor should a recent change in the method of
determining raw scores affect test results. Up to 1980, the raw score
was determined by adding the number of questions answered correctly and
subtracting one fraction of the incorrect answers and another fraction
of questions the test-taker did not answer at all. Starting in 1981 for
some tests (GRE General) and 1982 for others (e.g. the LSAT), the method
was changed so that the student was not directly penalized for questions
he/she did not answer. And one can infer from Grandy's comments that the
GRE/General Examinations now use a "rights only" method of scoring that
does not even penalize the student for guessing wrong answers. (20) The
method of determining raw scores should not change scaled scores, since
there will be appropriate statistical adjustments to insure
comparability and continuity. However, perhaps five years from now, one
could subject the data to rigorous statistical analysis to determine
whether that hypothesis holds.

Disclosure. Lastly, one might ask whether the recent "truth-in-
testing" practices have affected these test scores. Over the period
1979-1981, the governing boards for the GRE, LSAT and GMAT made both the
tests and answer sheets available to students following each test
administration. Some might assume that the more students who take
advantage of disclosure, the better prepared for subsequent examinations
the test-taking population will be, and therefore mean scores should
rise while Standard Deviations should remain steady or fall.

Given the recency of this development, we have very little information
on which to judge the effects of disclosure on the quality, results, and
uses of the tests. In fact, practice tests for the GRE Subject Area
examinations were not available until 1982-83, and therefore fall
outside this analysis.

The only examination on which test disclosure practice might have made a
difference is the LSAT, partly because, as Wigdor and Garner observe,
"LSAT scores receive more weight in decisions on law school admissions
than do scores on tests for other programs" (21). Knowing that is the
case, we would reason that more prospective law students would seek to
take advantage of disclosure as soon as it was possible, and the results
would be reflected in the data. Indeed, according to Bruce Zimmer, then
Executive Director of the Law School Admissions Council, 55% of "recent"
(1981) examinees took advantage of the disclosure option. (22) And
circumstantial evidence does suggest a relationship between test
performance and disclosure: the mean score on the LSAT jumped 14 points
and +.13 of a Standard Deviation Unit in two years (1979/80 to 1981/82).
But until we conduct some research, we are never going to know for sure
whether this otherwise tantalizing hypothesis can be validated.

13. Explaining the Changes, Round 1: Numbers of Test-Takers

Our judgment on the significance of the overall changes in scores 
depends on the numbers and characteristics of test-takers, and here 
there are some confusing trends. 

Now, the numbers of test-takers have changed rather dramatically in the
period under consideration (1965-1982), and these changes are often
attributed to external forces such as the Vietnam War and shifts in the
labor market. Regardless of external cause (that analysis lies beyond
the scope of this paper, and could not be conducted on the basis of
available data gathered from examinees), the conventional wisdom says
that there is an inverse relationship between the numbers of test takers
and mean scores. That is:

The greater the number of test takers, the lower the scores; 
The lesser the number of test takers, the higher the scores. 

That is supposed to be the way it goes, in very, very bald and simple
terms. Of course I have purposefully simplified the relationships. For
the ratios to work, many other variables must remain constant, e.g.
scales of tests, weightings of sub-test raw scores, etc. Nonetheless,
I'd like to keep the analysis very simple for the moment.

Is that the way it actually went from 1964-1982? Let's look at some
of the scores and numbers here to see what happened.

A. Graduate Record General Exams: Verbal and Quantitative

o The total number of test takers has more than doubled since 1963,
but has been falling steadily since 1976.

o The mean score on the GRE-Verbal fell steadily through the whole
period, as if it were impervious to the number of test takers.

o On the other hand, the mean score on the GRE-Q fell with the
rising number of test takers through 1976, then rose with the
falling number of test takers through 1982, just as conventional
wisdom holds.







B. Graduate Management Admissions Test

o The total number of test takers has quadrupled since 1965
(it may have peaked in 1980-1981, but it's too early to tell).

o The mean score followed the paradigm illustrated in Table E,
dropping rapidly from 1965-1971, then levelling off,
even as the numbers of test takers continued to rise. By the
strict constructionist reading of conventional wisdom, though,
the scores should fall with rising numbers.

C. Law School Admissions Test. Here we have two different sets of data,
from which we can nonetheless conclude that:

o The total number of test takers tripled between 1965 and 1974,
but has declined about 20% since then.

o No matter which set of data one uses, the mean scores on the
LSATs rose through the whole period, as if they were impervious
to the number of test takers.

*D. Graduate Record Achievement Tests: English and History

o The number of test takers rose dramatically during the period
1964-1970, but has fallen precipitously since then.

o Mean scores fell with rising numbers (just as conventional wisdom
would have it), but continued to fall along with the number of
test takers, thus running against conventional wisdom.

*E. Graduate Record Achievement Test: Biology

o The number of test-takers rose considerably from 1965 to 1978,
and has since reversed field, dropping 32% to 1982.

o Scores first fell, then rose with increasing numbers
(particularly from 1970-76), then fell again with decreasing
numbers. Only the first of these three movements proceeds
according to conventional wisdom.

F. Graduate Record Achievement Test: Mathematics

o The number of test-takers rose from 1965-1970, but has been
falling ever since.

o Mean scores fell with the rising numbers and rose with the
falling numbers, just as they are supposed to do.

*G. Graduate Record Achievement Test: Sociology

o The number of test-takers rose 150% from 1964-1968, rose another
66% in the next two years, then fell all the way back to its 1964
level by 1982.

o Scores remained stable from 1964-1968, then began falling, first
(and briefly) with rising numbers of examinees, and then with
falling numbers. Only the second of these three movements
proceeds according to conventional wisdom.
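
The comparisons in items A through G can be tallied mechanically. The
sketch below is my own paraphrase of those bullets, not data from the
tables: each movement is reduced to a direction for volume and a
direction for mean score, English and History are treated as one entry,
and the period labels are approximate. A movement counts as following
conventional wisdom only when the two directions are opposite.

```python
from collections import Counter

# (test, period, direction of volume, direction of mean score)
# +1 = rising, -1 = falling, 0 = stable or levelling off.
# Assumption: directions are read off the prose bullets, not the tables.
MOVEMENTS = [
    ("GRE-Verbal", "to 1976", +1, -1),
    ("GRE-Verbal", "1976-1982", -1, -1),
    ("GRE-Quantitative", "to 1976", +1, -1),
    ("GRE-Quantitative", "1976-1982", -1, +1),
    ("GMAT", "1965-1971", +1, -1),
    ("GMAT", "after 1971", +1, 0),
    ("LSAT", "1965-1974", +1, +1),
    ("LSAT", "after 1974", -1, +1),
    ("GRE English/History", "1964-1970", +1, -1),
    ("GRE English/History", "after 1970", -1, -1),
    ("GRE Biology", "first movement", +1, -1),
    ("GRE Biology", "second movement", +1, +1),
    ("GRE Biology", "third movement", -1, -1),
    ("GRE Mathematics", "1965-1970", +1, -1),
    ("GRE Mathematics", "after 1970", -1, +1),
    ("GRE Sociology", "1964-1968", +1, 0),
    ("GRE Sociology", "1968-1970", +1, -1),
    ("GRE Sociology", "after 1970", -1, -1),
]


def follows_conventional_wisdom(volume: int, score: int) -> bool:
    """True only when volume and mean score move in opposite directions."""
    return volume * score == -1


tally = Counter(
    follows_conventional_wisdom(volume, score)
    for _, _, volume, score in MOVEMENTS
)
print(f"follows: {tally[True]}, does not follow: {tally[False]}")
```

Under this coding the movements split almost evenly, which is the
pattern one would expect if volume alone explained very little.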

*See Table M

In the nine cases I have used for illustrative purposes, there are many
instances in which the relationship between score trends and number of
examinees does not follow conventional wisdom. The frequency of these
"counter-intuitive" situations is such as to dampen one's enthusiasm for
an analysis of trends based on the single variable of volume. But there
is another reason not to pursue this line of analysis: at the graduate
school level, it is not accurate. Conventional wisdom may work when one
is talking about moving from one general level of education to another,
such as high school to college, as is the case whenever a product or
service becomes both attractive and accessible to a previously marginal
population of consumers. But, as Turnbull writes, "where you are
considering shifts among fields of students at a particular level -- entry
into one segment rather than another -- I would not expect the same logic
to hold and would look for volume and scores to move up and down
together at least as often as they moved in opposite directions
[emphasis added]." (23)

That is, even when we are looking at a population of 550,000, we must
realize that the students are selecting not only to continue their
education at a very high level, but also to explore a very particular
territory. What applies at a more generalized level does not necessarily
apply here. Consequently, I would like to set this issue aside for a
while, and proceed under the conviction that one cannot explain the
changes we are witnessing only by reference to the number of test-takers.

14. Explaining the Changes, Round 2: Age, Race, Gender

The demographic characteristics of the test-taking population are
usually regarded as fruitful sources in explaining trends in performance
on standardized tests. Let us look at three of the most basic of these
characteristics (age, race, and gender) to see whether they can help
refine our understanding of what has happened. The basic references are
Tables A and B, and we are looking at changes only during the years for
which we have information on the demographic background of the
test-takers (1975-1982) and only at the tests of general learned ability
(GRE, LSAT, GMAT, and the Reading and Quantitative Sub-Tests of the MCAT
since 1977).

Let us remind ourselves, first, what happened on each one of those tests 
during that period: 

                                          Pt. Change    S.D.U. Change

GRE: Verbal        492    to   469           (23)           (.18)
GMAT: Verbal       25.9   to   26.5           0.6            .06
MCAT: Reading      7.98   to   7.74          (.24)          (.10)
LSAT               530    to   553            23             .21
GRE: Quant         510    to   533            23             .17
GMAT: Quant        27.0   to   27.4           0.4            .05
MCAT: Quant.       7.99   to   7.45          (.54)          (.21)
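
The figures in this table can be cross-checked against one another. The
sketch below recomputes each point change from the two means and backs
out the Standard Deviation that each reported S.D.U. change implies.
Two assumptions are mine: parenthesized entries are read as declines,
and the "implied" Standard Deviation is only as precise as the rounded
S.D.U. figure allows.

```python
# (test, starting mean, ending mean, reported S.D.U. change)
# Assumption: parenthesized table entries are declines, entered here as negatives.
ROWS = [
    ("GRE: Verbal", 492, 469, -0.18),
    ("GMAT: Verbal", 25.9, 26.5, 0.06),
    ("MCAT: Reading", 7.98, 7.74, -0.10),
    ("LSAT", 530, 553, 0.21),
    ("GRE: Quant", 510, 533, 0.17),
    ("GMAT: Quant", 27.0, 27.4, 0.05),
    ("MCAT: Quant.", 7.99, 7.45, -0.21),
]


def implied_sd(start: float, end: float, sdu: float) -> float:
    """Standard Deviation implied by a point change and its S.D.U. equivalent."""
    return (end - start) / sdu


for name, start, end, sdu in ROWS:
    change = end - start
    print(f"{name:14s} change {change:+7.2f}   implied SD {implied_sd(start, end, sdu):6.1f}")
```

The three tests that each moved 23 points did not move equally in
S.D.U. terms, because their Standard Deviations differ; that is the
argument for reporting changes in Standard Deviation Units rather than
points.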



There are obviously some distinctly different trends here. Do the basic 
demographic variables explain them? 

Age. The college population is growing older and it is hence not
surprising that the population of those who take these tests is growing
older. As evident in Table B, the percentage of those in the
traditional 19-24 age group fell significantly on both the GREs and
LSATs, while remaining stable for the GMATs. The MCATs do not report
data by age, but we can reason that since the percentage of college
graduates taking the MCATs has remained stable during a period in which
the average age of college graduates rose (see Table B), the MCAT
test-taking population is also a bit older.

Should older students perform better on the GRE-Q and worse on the
GRE-V, for example? I certainly would not expect so. After all,
mathematics is a wholly school-learned subject, and the further away in
time one gets from the active study of mathematics, the less likely one
would perform well on a test of general quantitative ability. At the
same time, verbal facility should increase with age, and with the
reading, writing, work and social experience (hence, language use) that
comes with age.


The MCAT: Reading sub-test trend reinforces that of the GRE-V; but the
MCAT: Quantitative trend moves in precisely the opposite direction from
the GRE-Q. The explanation is fairly simple. The GRE-Q draws largely
on secondary school mathematics; but the MCAT: Quantitative also draws
on college-level mathematics, and is more of an achievement test.
Pre-medical students, one can reasonably speculate, pay more attention
to their preparation in scientific subjects than they do to mathematics,
and the trend in scores on the MCAT sub-tests in Biology and Chemistry
supports that speculation (besides, some 77% of the MCAT test-takers
major in either the Biological or Physical sciences).

Age does not seem to be an issue with respect to the test-taking
population for the GMATs. But what do we say about the LSAT in this regard?
The LSAT is an examination that relies heavily on verbal skills, and the
trend in LSAT scores moved precisely in the opposite direction from that
for the GRE-V during the period under consideration. Age alone cannot
account for that difference.

Thus, the first demographic variable in which there has been a
significant change over the seven year period in question is not much help
in explaining the differences in test score trends.

Racial/Ethnic Characteristics. The racial/ethnic distribution of
U.S. citizen test-takers changed but fractionally during the 1975-1982
period, and not enough to affect mean scores. The percentage of whites
taking the GREs, for example, declined from 87.4% in 1975 to 86.1% in
1982, hardly a staggering move. During the same time, the percentage of
Asian-American test takers increased from 1.3% to 1.9%. While
Asian-Americans tend to score higher on the GRE-Q than other groups, that
percentage increase would not account for a +.17 change in Standard
Deviation Units across 256,000 test-takers.
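
A rough bound shows why so small a shift in composition cannot carry a
change of that size. If every group's own mean is held fixed, the
overall mean moves by the change in a group's share multiplied by that
group's gap from the rest. The gap used below (one full Standard
Deviation) is a deliberately generous, hypothetical figure; the
reported data do not supply it.

```python
def mean_shift_from_share_change(
    share_before: float, share_after: float, group_gap_sdu: float
) -> float:
    """Shift in the overall mean, in S.D.U.s, when one group's share of the
    population changes and each group's own mean stays fixed."""
    return (share_after - share_before) * group_gap_sdu


# Shares from the text: 1.3% rising to 1.9% of GRE test-takers.
# Assumption: a hypothetical GRE-Q advantage of one full Standard Deviation.
assumed_gap_sdu = 1.0
shift = mean_shift_from_share_change(0.013, 0.019, assumed_gap_sdu)

print(f"shift attributable to composition: {shift:+.3f} S.D.U.")
print(f"share of the observed +.17 change: {shift / 0.17:.1%}")
```

Under that assumption the compositional effect is well under a
hundredth of a Standard Deviation Unit, a small fraction of the change
actually observed.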

As for the performance of other minority groups on these tests, there is 
no single trend. For example, looking at the performance of 
Afro-Americans and Mexican-Americans compared with that of all U.S. 
citizens on the LSATs and GMATs, we find: 






                    Change in Standard Deviation Units

                              Afro-         Mexican-      All U.S.
                              Americans     Americans     Citizens

LSAT, 1975/76 to 1981/82        ...           +.09*         +.21
GMAT, 1977/78 to 1981/82        ...           +.10          +.09

(Afro-American column: only one entry, ".16", is legible in the source.)

*Approximate, since reporting categories for Hispanics
changed in 1979/80.



One could continue with similar examples of variability: greater or
lesser change for specific minority groups on different measures of
general learned abilities. Whatever separate analyses in which one
might wish to engage on minority participation and performance on these
examinations, the point here is that this second set of classic
demographic variables is not much help in explaining the overall trends
in test scores evident in the reported data.

Gender. The percentage of women test-takers rose dramatically over
the seven year period: from 29% to 39% on the LSAT and from 27% to 37%
on the MCATs, for example. Would this trend (also implied by data on
the GREs and GMATs) explain the recent trends in test scores?

If we accept stereotypes, women are supposed to perform above average on
the verbal sections of examinations and below average on the
quantitative. The trends, however, suggest precisely the reverse. In fact,
fragmentary data from 1976-79 to 1981-82 evidence declining GRE-V scores
and advancing GRE-Q scores for women. But as the same trends hold for
men, gender cannot be a critical variable here. On the GMAT verbal and
quantitative sub-test scores for the years 1977-1982, women's
performance basically was unchanged (+.03 S.D.U. for the GMAT/Verbal and
+.02 S.D.U. for the GMAT/Quantitative). It is hard to attribute
stability in academic performance to gender.

What may be happening, though, as women raise their educational
aspirations, is that more women of average ability are taking these
examinations, or, at the least, that there is greater variance in the
abilities of female examinees. As Turnbull's marginalia on an early draft
of this paper noted, "here...the conventional wisdom [concerning numbers
of test-takers] applies." If that were true, though, the Standard
Deviations for women would have risen. Unfortunately, the public data
on this issue are fragmentary. Nonetheless, even the fragmentary data
suggest that precisely the opposite has occurred. The Standard
Deviations for women on the LSATs have remained fairly stable since
1975-76; those on the GMATs have fallen dramatically since 1977-78; and
those on the GRE/V have declined slightly since 1978-79. Only on the
GRE/Q has the Standard Deviation for women risen.

15. Explaining the Changes, Round 3: Citizenship and Native Language

One of the most noticeable characteristics of GRE and GMAT test-taking
populations is a significant percentage of non-U.S. citizens (as
previously noted, neither the LSAT nor the MCAT reports identify
citizenship or native language*). For the GMATs, that proportion has
hovered consistently around 20% since 1977 (data for earlier years are
unavailable). For the GREs, the proportion of non-U.S. citizens rose
from 7.5% in 1975/76 to 13.3% in 1981/82. This trend becomes more
significant in light of a parallel increase in the percentage of GRE
test-takers who say that English is not their native language: from 6.0%
in 1975/76 to 10.2% in 1981/82 (among U.S. citizens, however, the
percentage of those who felt more comfortable in a language other than
English remained stable and low, about 2%).

Graduate and professional education in the United States has a worldwide
reputation for quality. And the systems of higher education in many
other countries do not provide the same opportunities for advanced study
as we do (whether the American Ph.D. in some fields is regarded overseas
with the same reverence we accord it here is another question, one that
lies beyond the domain of this discussion). So there is nothing
surprising about a significant number of foreign nationals applying to
our graduate and professional schools, and taking the requisite
qualifying examinations.

Is this proportion of foreign student test-takers on the GRE/General
examinations and the GMAT sufficient to influence mean scores and
Standard Deviations? Yes. Does it explain the changes we observe over
the 18 year period, 1964-1982, let alone the shorter term (1977-1982)
for which we have full information on test-takers? No.

Where there has been a shift in the ratio of non-U.S. citizens, we have
some clues as to how those test-takers will influence scores. The
GRE/General examinations offer such a case. And the GMAT data confirm
the trends evident in the GREs. Table H illustrates the comparative
performance of "domestic" and "foreign" students on these examinations,
the only two for which we can disaggregate scores. Leaving aside the
way in which the GREs define "domestic" test-takers (the definition is a
dubious one), and leaving aside the various methods of disaggregation,
we can clearly observe that foreign nationals drag down the mean scores
on the verbal sections of these examinations and prop up the mean scores
on the quantitative sections.

Without foreign nationals in the picture, for example, the GRE/Q would
have declined by (.15) of a Standard Deviation Unit instead of remaining
unchanged between 1964-1982. Without foreign nationals in the picture,
the GRE/V would have declined by (.38) of an S.D.U. instead of (.48)
over the same period.

But what we do not know, in both cases, is the percentage of those
foreign nationals who graduated from U.S. colleges and universities
and/or who were native speakers of English (i.e. held citizenship in
Canada, the U.K., Ireland, Australia, New Zealand, and nations in both
Africa and the West Indies where English was the colonial language, and
hence the language of the schools and colleges).

* Perhaps with good reason, as only 0.3% of all law degrees awarded in
1980-81 and only 0.8% of all medical degrees, as opposed to 12.8% of the
Ph.D.s, went to non-resident aliens. (24)

To illustrate one part of this point: for 1981-82, the GMATs indicate
that 21% of the test-takers were non-U.S. citizens and 21% did not speak
English as a native language. But 7.5% of the test-takers were citizens
of English-speaking nations other than the U.S. or nations where the
colonial language was English. What that may mean for the GMATs is that
we can reduce the effects of the performance of foreign nationals on the
Verbal sub-score by roughly one-third (7.5% over 21%). If the same
pattern holds for the GREs (and we do not know because the GRE does not
cover this information), then the effects might be diluted in a parallel
manner. Of course, this is all hypothesis; to prove the case one needs a
statistical analysis using primary data.

If commentators want to claim that the performance of foreign students
accounts for most of the change in GRE scores we have witnessed in
recent years (and, indeed, we have heard such claims), they cannot do so
on the basis of the reported data. In order to support that claim, one
needs to look at a critical universe of test-takers who neither (1)
speak English as a native language, (2) reside in the United States, nor
(3) graduated from a U.S. college or university. And yet none of the
examinations reports performance for this critical group. The GMATs
present cross-tabulations of scores by native language and by country of
citizenship (the Graduate Management Admissions Council is to be
commended for requesting that information), but do not indicate which of
those students graduated from U.S. colleges. The GRE Summary
distinguishes resident from non-resident aliens in gross numbers, but
the data in Table H (drawn from other ETS studies) do not make the same
distinction. One suspects that a resident alien at the time of
test-taking is likely to be a graduate or soon-to-be-graduate of a U.S.
college or university; hence, his/her score should not be included among
the "foreign means."

I continue to use the GMATs as a gloss on this issue because, among all
the tests we are examining, the GMATs provide the most detailed and
comprehensive data. We can thus use the GMAT data to provide us with
some insight as to the impact of foreign nationals on quantitative
scores. The best way to elicit this insight is to perform an analysis
by two variables: citizenship and undergraduate major. Table K breaks
out the performance of U.S. citizens and foreign nationals on the
1981-82 GMAT for selected quantitatively-oriented undergraduate majors
and selected non-quantitatively-oriented undergraduate majors. From
these data, we can conclude that:

(a) U.S. citizens and foreign nationals who majored as
undergraduates in quantitatively-oriented subjects perform
equally well on the quantitative sections of the GMAT; and

(b) Foreign nationals who majored in non-quantitatively based
subjects as undergraduates outperform their U.S. counterparts
on the quantitative section of the GMAT.




If the same phenomena hold for the GREs (and we cannot tell because the
way the data are reported in GRE Summary does not allow us to engage in
this analysis), and if the percentage of foreign nationals taking the
GREs who majored in non-quantitatively based subjects has been rising,
then that might account for part of the rise in the GRE-Q scores since
1975.

Can we determine whether the trends in performance on the GRE Subject 
Area Tests have been influenced by the participation of foreign 
nationals? Since, to the best of my knowledge, there is no available 
data on the percentage of GRE Subject Area test-takers who are foreign 
nationals, I tried an indirect route to answering the question. While 
the route does not bring us to the answer (indeed, there are strong 
arguments against taking it in the first place), it raises a critical 
issue and hence is worth recounting.

The National Research Council reports regularly on Doctoral degree 
recipients from U.S. universities, covering citizenship (resident and 
non-resident) by field (among other variables). (25) Here, for example, 
are the percentages of doctorates that were granted to foreign nationals 
in 1980 in selected fields for which there are GRE Subject Area Tests.
The list is arranged in the order of net changes in test scores as 
reported in Table D: 



                   Percentage of          Change in Test
                   Foreign National       Scores (in S.D.U.s)
Field              Doctorates, 1980       1964-1982

Mathematics            27.1%                  +.28
Physics                24.4                   +.17
Economics              31.9                   -.08
Chemistry              21.8                   -.11
Engineering            46.3                   -.22
Psychology              3.9                   -.26
Education               8.2                   -.28
History                 6.3                   -.70
English                 5.2                   -.72



It certainly is coincidental that the greater the drop in test scores,
the lower the participation of foreign nationals, at least among
doctorates. But that is just coincidence. After all, those who received the
doctorate in 1980 began their graduate studies an average of 9.3 years
before that time. And those who actually receive the doctorate are a
very select sub-set of those who took the GRE examinations those many
years previously. There are other reasons not to pay much attention to
the coincidence (e.g. due to practices in the U.S. labor market, native
graduate students in Engineering do not tend to seek the Ph.D., rather
stop at the Master's level), but the exercise raises what may be the
most critical variable of all in explaining trends in performance on all
the examinations under consideration: undergraduate major.
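
For readers who want to see how strong the coincidence in the preceding
table is, a rank correlation can be computed by hand. The sketch below
is offered only as a description of the nine rows; it says nothing about
cause, for the reasons just given. Two readings of the source are
assumptions: the Mathematics and Physics changes are taken as increases
(the text names them as the only two GRE tests with rising means), and
the Physics percentage is taken as 24.4, although any value between
21.8 and 27.1 gives the same ranking.

```python
# (field, % of 1980 doctorates to foreign nationals, S.D.U. change 1964-1982)
FIELDS = [
    ("Mathematics", 27.1, 0.28),
    ("Physics", 24.4, 0.17),  # second digit uncertain; rank is unaffected
    ("Economics", 31.9, -0.08),
    ("Chemistry", 21.8, -0.11),
    ("Engineering", 46.3, -0.22),
    ("Psychology", 3.9, -0.26),
    ("Education", 8.2, -0.28),
    ("History", 6.3, -0.70),
    ("English", 5.2, -0.72),
]


def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(xs, ys):
    """Spearman rank correlation for untied data."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


shares = [share for _, share, _ in FIELDS]
changes = [change for _, _, change in FIELDS]
rho = spearman_rho(shares, changes)
print(f"Spearman rank correlation: {rho:.2f}")
```

With only nine fields, and with Engineering sitting far out of line, a
coefficient of this size is suggestive at most.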






16. Explaining the Changes, Round 4: Undergraduate Major

The basic demographic variables did not help us explain the trends in
test scores of college graduates. And the last of those variables,
citizenship and native language, while offering some potential, also
proved unsatisfactory until we introduced a non-demographic variable,
the nature of formal schooling as reflected in undergraduate major.

It stands to reason that people who take examinations for admission to
graduate or professional school bring a tremendous amount of
intellectual training to the testing room. Whether they are taking an
examination covering knowledge and methods in a specific field such as
Biology or History, or an examination that seeks to elicit facility in
certain general modes of thought, e.g. inductive problem-solving on the
GMATs or deductive reasoning on the LSATs, students bring anywhere from
15-20 years of formal schooling to the examination. While the effects
of formal schooling are always cumulative, what scholars of rhetoric and
propaganda call the "recency effect" is rather strong when it comes to
performance on an examination. That is, whatever you have been studying
or doing during the few years before you take an examination will have a
much stronger bearing on your performance than earlier schooling or
experience. For most test-takers, that "whatever" is the undergraduate
major.

To be sure, there are other influences, some having to do with formal
schooling, others having to do with work experience. For 23% of the GRE
test-takers (and 14% of the LSAT test-takers) enrolled in graduate
school in 1981-82, the graduate program might have great effect. For
the 50% of the GMAT test-takers who have two or more years of work
experience (the Graduate Schools of Business and Management encourage
practical work experience among applicants for admission), the
particular context and nature of that work might have considerable
influence on test performance. And certainly all of us can cite sloppy
intervening variables in our lives that might influence our performance
on different measures of learned ability or specific field knowledge.
But few of us expend as much time or effort on any aspect of our
learning as we do on our undergraduate major. Certainly, that experience
should result in observable effects on examinations such as these.

The analysis of test performance by undergraduate major is presented in
Table F, first by year and then by major. Since a number of categories
represent aggregated scores, I could not use Standard Deviation Units to
indicate differences in a secondary data analysis such as this.
Therefore, and with some reluctance, I chose to ask the question:

"By what percent did the mean score of those who majored in
________ differ from the mean score of all test-takers who
identified their undergraduate major?"

This question was asked with reference to 30 different categories of
undergraduate major on the tests of general learned ability (GRE/V,
GRE/Q, LSAT, and GMAT)* and for the years for which the data were
available for all three tests (1977-1982). What can one conclude? And
what hypotheses can one offer for further exploration?

*The MCAT is excluded from this analysis because it reports for only
seven broad categories of undergraduate major.
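
The percent-difference question posed above reduces to one line of arithmetic. The following is a minimal Python sketch of that calculation, not the procedure actually used to build Table F; the function name and the example means are hypothetical and are not taken from the tables.

```python
def percent_difference(major_mean: float, overall_mean: float) -> float:
    """Percent by which one major's mean score differs from the mean score
    of all test-takers who identified an undergraduate major."""
    if overall_mean == 0:
        raise ValueError("overall mean must be non-zero")
    return (major_mean - overall_mean) / overall_mean * 100.0


# Hypothetical means for illustration only; these are NOT values from Table F.
overall = 500.0
example_means = {"Major A": 550.0, "Major B": 475.0}

for major, mean in example_means.items():
    print(f"{major}: {percent_difference(mean, overall):+.1f}%")
```

A positive result means the major scored above the overall mean, a negative result below it. Unlike Standard Deviation Units, this measure ignores the spread of scores, which is why the text adopts it only with reluctance.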

Conclusion #1: With the exception of Engineering majors, under-
graduates who major in professional and occupational fields
consistently underperform those who major in traditional arts
and sciences fields on these examinations.

Table F covers fields that are common in the data reporting of the three
examinations in question. Hence, by "professional fields," I am
specifically referring to Business Administration (and its allied or
sub-fields), Education, Social Work, and Journalism. U.S. citizen
test-takers who identified their major in one of these professional
fields accounted for approximately 30% of U.S. citizens who identified
their major for any of the 30 fields listed.

In addition, one can turn to the GRE and LSAT data for such fields (not
listed in Table F) as Health Administration, Pharmacy, Agriculture,
Nutrition, etc. and find the same pattern of underperformance (Nursing
and Architecture, both of which are programs that often run more than
four years, are exceptions). In 1981-82, these other fields would have
added approximately 8,000 test-takers to the professional/occupational
category, and increased the overall representation of undergraduate
professional/occupational field majors to 32% (Nursing and Architecture
would add 9,800 test-takers, and bring the professional/occupational
proportion up to 34% of those who identified undergraduate major).

Now this group comprises a substantial and growing portion of the
test-takers. Why undergraduate professional/occupational majors have an
almost exclusive purchase on the bottom of the performance barrel on
these examinations is not hard to see. As recent ACT data on entering
college freshmen have demonstrated, most of these majors (again,
Nursing, Architecture and Engineering excepted) do not attract the best
students (see Table L). But this "input" explanation, by itself, is
insufficient. Driven by the requirements of specialized accrediting
bodies, the curricula in many of these areas tend to be confined to very
few fields, none of which require the exercise or development of the
verbal skills necessary to perform well on examinations such as these.
Nor do any of these professional/occupational "disciplines" have strong
knowledge paradigms, structures that require the rigorous exercise of
analysis and synthesis that is so often reflected on the tests.

I realize this is a sweeping statement, and one that is part of an old
argument in American higher education. But the test data support this
inference. One example should suffice. Let us compare the performance
of Economics majors with that of Business majors on the GMAT, an
examination that indicates whether a student is prepared to undertake
graduate work in Business Administration. As Tables F-4 through F-8 and
F-13 show, Economics majors rank in the top performing groups on the
GMAT while the various Business majors rank at the bottom. To be sure,
nearly five times as many Business majors take the GMATs as do Economics
majors (though Economics majors are more likely to take the examination
than Business majors). But the "conventional wisdom" about numbers and
test scores is not sufficient to explain the extreme differential in
performance.

Economics is a discipline with a strong knowledge paradigm: it
emphasizes abstract models and theory, requires research, and seeks
predictive knowledge. It is a basic discipline. We know that
undergraduate Business majors (the GMAT categories include Accounting,
Finance, Marketing, Management, Hotel Administration, Real Estate, and
others) are required to take some Economics, though the core
requirements are rather minimal. The accreditation guidelines of the
American Assembly of Collegiate Schools of Business are not even
specific in this regard. (26) At the same time, we assume that Business
majors receive a good deal of training in basic quantitative skills and
problem solving, both of which are heavily emphasized on the GMAT. But
the contexts for that emphasis on the GMAT (as in the business world
itself) are verbal, draw on theory and models, require a broad
knowledge of the world and political and cultural forces in American
society, and challenge students to demonstrate prowess in the kind of
reasoning in which the researcher engages. None of these contexts are
developed in specialized programs in Business Administration (or allied
fields). And when departments follow accreditation guidelines and
students take a minimum of one-half of their credits in Business
Administration, it is no wonder that Business majors do not perform well
on an examination used in the admissions process to graduate schools of
Business.

Hypothesis: The greater the proportion of test-takers whose
undergraduate majors were in professional/occupational fields,
the lower the test scores.

It would be reassuring to unlock the mystery of test-score trends solely
with reference to undergraduate majors, but we do not have the
information to do so in a secondary analysis. Hence, a hypothesis.

While it may be the case that test-takers with professional or
occupational degrees underperform others, the effect on mean scores over
time would not be significant unless the proportion of test-takers from
these fields: (1) is substantial to begin with, and (2) has risen
significantly during the period under consideration.
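
The two conditions above follow from the arithmetic of a weighted mean. The following is a minimal Python sketch, assuming only two groups whose own mean scores stay fixed while their shares of the test-taking population change. The shares (0.29 and 0.34) echo the 1977 and 1982 proportions quoted in this section; the group means (450 and 520) and the function names are hypothetical and are not drawn from the test data.

```python
def overall_mean(share_prof: float, mean_prof: float, mean_other: float) -> float:
    """Mean score of a population split into two groups, weighted by share."""
    if not 0.0 <= share_prof <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return share_prof * mean_prof + (1.0 - share_prof) * mean_other


def composition_shift(share_start: float, share_end: float,
                      mean_prof: float, mean_other: float) -> float:
    """Change in the overall mean caused only by the change in group shares."""
    return (overall_mean(share_end, mean_prof, mean_other)
            - overall_mean(share_start, mean_prof, mean_other))


# Hypothetical group means (assumption for illustration, not reported data).
MEAN_PROF, MEAN_OTHER = 450.0, 520.0

shift = composition_shift(0.29, 0.34, MEAN_PROF, MEAN_OTHER)
print(f"shift in overall mean from composition alone: {shift:+.2f} points")
```

Under these assumptions the shift equals the change in share multiplied by the gap between the two group means, so it is small unless both are large. A real analysis would also have to allow the group means themselves to change, which is why the text calls for regression analysis.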

We know that the proportion of bachelor's degrees awarded in
professional and occupational fields has increased from 51% in 1971 to
64% in 1982, a 25.5% increase in eleven years. (27) And we know that
the proportion of test-takers from the professional and occupational
fields (exclusive of Engineering) has increased from roughly 29% in 1977
to 34% in 1982, a 17.2% increase in five years. Common sense will allow
us to extrapolate from both these trends.
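
The two increases quoted above are relative changes in a share, not percentage-point changes. A minimal Python sketch that reproduces them from the four proportions given in the text (the function name is an illustrative assumption):

```python
def relative_increase(start_share: float, end_share: float) -> float:
    """Relative change, in percent, between two shares given in percent."""
    if start_share == 0:
        raise ValueError("starting share must be non-zero")
    return (end_share - start_share) / start_share * 100.0


# Shares as quoted in the text.
degrees = relative_increase(51, 64)      # bachelor's degrees, 1971 to 1982
test_takers = relative_increase(29, 34)  # test-takers, 1977 to 1982

print(f"degrees awarded: {degrees:.1f}% relative increase")
print(f"test-takers:     {test_takers:.1f}% relative increase")
```

In percentage points the same changes are 13 and 5 respectively; the text reports the relative figures.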

Even if arts and sciences majors are twice as likely to take these
examinations as others, we can say that, since 1971 (and at a minimum),
there has probably been an increase in the proportion of test-takers who
hold professional/occupational degrees equivalent to the increase in
the proportion of the degrees awarded in those fields, namely 25.5%.

However, there is reason to believe that the increase in the proportion
of test-takers whose undergraduate majors were in professional or
occupational fields has been greater than 25.5% since 1971. The
fastest growing examination in terms of candidate volume from 1971 to
1982 was the GMAT (146% increase), and GMAT test-takers are more than
twice as likely to hold professional/occupational degrees as GRE
test-takers.

This is a difficult case, and one for the statisticians to confront.
That is why I have advanced it as a hypothesis. Regression analyses are
necessary to isolate the effects of a variable such as undergraduate
major over a period of time during which tests were reweighted, rules
were changed, and disclosure was added to the complications. And the
demonstrable effects would have to be statistically significant. Better
still would be research with primary data that could demonstrate that
entering freshmen of similar ability who pursue professional/vocational
v. arts & sciences curricula evidence significantly different degrees of
achievement on these examinations.

Conclusion #2: Students with undergraduate majors in science,
mathematics and Engineering perform better than all others on
these examinations.

These students perform better not only on quantitatively-based
examinations (as we would expect), but also, comparatively speaking, on
examinations that rely heavily on verbal abilities. The LSAT is a case
in point: all the science majors outperform students with undergraduate
majors not only in professional fields but also in fields that
conventional wisdom holds would be good preparation for the law, e.g.
Political Science and History. And, in this group of majors, only
Engineers and Computer Scientists perform below average on the
GRE/Verbal. In fact, Biology and Physics majors outperform Psychology,
Art and Music majors on the GRE/V.

How do we explain this? Our first cut sounds like an "input approach."
Undergraduate science and math curricula generally attract more gifted
students who are willing to undergo the considerable rigor that those
disciplines demand, and who probably have higher academic aspirations
(hence are more likely to take at least one of these examinations).
That the students may be more gifted is hinted at by the GRE Subject
Area Tests, where the long-term trends in scores in the sciences and
mathematics are far more positive than those in other fields.

But when one looks at the list of majors that consistently outperform
all others across these examinations (see Table F-13), one notices some
non-scientific fields, e.g. Philosophy and Economics, and hence cannot
be wholly comfortable with the "input" explanation. Considering the
addition of those fields, Conclusion #2 should be modified.

Conclusion #3: Students who major in a field characterized by
formal thought, structural relationships, abstract models,
symbolic languages, and deductive reasoning consistently
outperform others on these examinations.

Mathematics, Economics, Philosophy, Chemistry, Engineering: all of these
require the use of symbolic languages, all are characterized by
structural relationships that proceed according to rules, and all
require students to exercise the powers of deductive reasoning. To be
sure, that is but a very general account of some of the key common
elements of the knowledge paradigms of those disciplines, and no doubt
there will be some quibbling. But both the LSAT and GMATs, in
particular, require students to exercise the modes of thought implied by
those paradigms.

One can verify this conclusion by breaking up some of the 30 categories
of majors in order to isolate others (such as Music and Linguistics)
that evidence similar knowledge paradigms. Indeed, Music majors, whose
work in composition and theory is very structural, symbolic, deductive,
etc., hold up the scores in the Fine Arts categories. And if one could
disaggregate the "Foreign Language" category across all three tests, I
think it is reasonable to speculate that those who major in the more
highly inflected and "synthetic" languages (such as German, Russian, or
Latin) would perform as well as the Mathematics, Economics,
Philosophy, etc. majors, and for similar reasons.

To be sure, Humanities majors (particularly those in English and Foreign
Languages) perform well above the mean on all these examinations (except
the GRE/Q). One can explain this, in part, with reference to the nature
of the tests. For example, the GMATs emphasize "analysis of situations"
questions that call for the close reading of text and the sensitivity to
nuance to which English, Foreign Language, History and Philosophy majors
are accustomed, one reason that they perform comparatively well on the
GMATs.

But that is only a half-explanation. After all, when one looks back at
the GRE Subject Area Tests (Table D), it appears that, as Eldon Park of
the Graduate Record Office at ETS observed, the virtual bottom has
fallen out of some of the disciplines (28), and I would include the
Humanities among them (though the case concerning the Social Sciences is
far more severe). What is going on?

Conclusion #4: Even as the numbers of their majors decline, the
Humanities disciplines are witnessing their best students going
on to professional school, not graduate school. But we cannot
reach the same conclusion concerning the Social Sciences.

As Table F-9 shows, the percentage of test-takers on the LSATs, GMATs,
and GRE/Generals from the various undergraduate Humanities fields has
remained relatively stable in recent years. At the same time, however,
Table A demonstrates a precipitous decline in the number of candidates
taking the GRE Subject Area Tests in English and History, even while
performance on these tests plummeted. We can thus infer that a greater
percentage of Humanities majors are moving into professional schools
following college.

Consider the evidence: the test scores on the LSAT are up, those on the
GMAT are relatively stable, and those on the GRE/Verbal and Subject Area
Tests (French, History, English) are down. While the percentage of
Humanities majors among the LSAT and GMAT test-taking population is not
sufficient to determine trends in scores, no doubt the shift has
contributed to those trends. It may be worth adding that even though
they comprise only 3% of the first-time examinees on the MCATs,
Humanities majors have outperformed all others (including Biological
Sciences majors) on the Biology sub-test since 1980 (and outperformed
all others on the Reading sub-test since the new MCATs were introduced
in 1977).

Test results for majors in the Social Sciences are much more variable,
and do not allow for a similar interpretation. Social Science majors
comprised 24.7% of those who identified their major in 1981-82 (down
from 27.6% in 1977-78). As in the case of English and History, the
steepest declines in test participation occurred in subjects for which
performance on the GRE Subject Area Tests fell most dramatically during
even that short recent period:

                           Decline in         Change in
                           Participation,     Test Performance,
GRE Subject Area Test      1977-1982          1975-1982 (S.D.U.)

Economics                       5.7%               +.15
Psychology                     11.8%               +.01
Political Science              34.2%               -.13
Sociology                      44.8%               -.20

Our hypothesis concerning the movement of the better English, History
and other humanities majors to graduate work in the professions does not
bear fruit here. Neither Political Science nor Sociology majors perform
comparatively well on either the LSAT or the GMAT. One might conclude
that the better students who major in those disciplines are not
participating at all in these examination programs.



17. General Conclusions 

What should we make of all this? The standardized test scores of
college graduates (and soon-to-be-graduates) have declined. Even in the
most recent period covered by this analysis, 1976-1982, scores generally
continued to decline, though with modest slopes and with noted
exceptions in professional and some quantitatively oriented fields.

It bears repeating that in no place in this paper do I mean to imply
that trends in scores on standardized tests should be the principal
indicator of quality in American higher education. It is rhetorically
discomforting to repeat that statement too many times.

Nor should anyone take this analysis to be definitive concerning the
causes of change in test performance over time. Student perceptions of
the necessity of graduate or professional education in relation to their
understanding of the labor market, for example, may be significant in
determining the composition of the test-taking population (and that, in
turn, may influence our judgment of the aggregate measures of
performance). Such hypotheses concerning external influences are
numerous, but the data are sparse, and this paper does not pretend to
tread in territory where the roadsigns are few. What we have here,
instead, are hypotheses grounded in data internal to the testing
process, that seem to make sense, but that should be put to the test of
statistical analysis by others who are better qualified to do so and who
have access to the primary data.

My general conclusion may sound like a begging of the question: we need
that research, and with particular attention to two phenomena that have
been stressed in this analysis:

The first of these covers the three periods of rate and direction of
change in the scores, along with explanations for the "turning points,"
explanations related principally to matters of test content, content
weighting, administration and score reporting. This phenomenon has
nothing to do with the test-taking population.

The second covers change in the test-taking population in terms of the
undergraduate major and academic experience of test-takers, in
combination with controls for ability based on SAT or ACT scores. Of
all the factors that influence performance on these examinations, these
appear to be the most persuasive. None of the other variables we
examined (numbers of test-takers, age, race, gender, citizenship, native
language) seems to mean as much until it is combined with undergraduate
major; and controls for ability at the point of college entrance seem
very important to our understanding of the impact of college major on
student performance. There are, of course, statistical techniques for
exploring these hypotheses, and I urge the very capable research
personnel at the testing services to follow through, using as much of
the primary data as they can clean up.

But there is a great challenge in designing the research on the impact
of undergraduate major: it has to fit the context of those three periods
of change. For example, if we isolate the period of sharply declining
test scores (1964-197?), we may find that undergraduate major has far
less of a correlation with performance than demographic variables. The
results may be completely different within the more recent period
(1976-1982) of relative stability/modest decline in test scores.

It is this more recent period of relative stability/modest decline for
which we have considerable data on the test-taking population, and is
perhaps as good as any period to begin monitoring the future. But for
us to monitor well, we need better information and cleaner data.

18. A Message to the Testing Services: Gathering Consistent Data

Throughout this exploration, we have been frustrated by data that have
been either inconsistently gathered or inconsistently reported. At the
outset, I pointed out that the testing services have five
constituencies, but that they serve only one when they report data. It
does no violation to the basic responsibilities of the testing services
to their primary clients if they also gather and report much more
accurate and consistent data for use by the others.



There are at least two more compelling reasons for the principal clients
of the testing services to be more "public-spirited" in this matter, and
so go to the additional effort and expense that the gathering and
accurate reporting of more detailed data entails.

o To mitigate the effects of the misuse of tests. As previously
pointed out, the lay public and policy makers (let alone higher
education administrators) are going to use test scores in highly
symbolic ways and will make academic judgments and develop
academic policies on the basis of their perceptions. They will
take all of these actions whether we like it or not. But the
nature of their perceptions and judgments will result in gross
misuse of the scores if the data are inconsistent and incomplete.

o To stimulate improvement in higher education. That is, if the
testing programs can gather data that will help faculty and
administrators think more carefully about the factors that
influence student academic achievement and growth and about the
most appropriate and productive ways of assessing that
achievement and growth, everyone benefits.

But the first task is one for the historians. My experience with the
LSAT data indicates that it is possible to go back into the existing
files, and clean them in ways so that information can be reported:

(a) only for those who actually took a particular test in a given
testing year (and for those who took it more than once, to
report only the most recent score within that year); and

(b) for the same universe of test-takers on each variable.

In this way, the testing services could reconstruct their public
statistical histories to reflect the performance of real people who took
the examinations, and could provide at least a modicum of consistency
for future analyses.
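
Rules (a) and (b) can be sketched as a simple filtering pass. The following is a minimal Python illustration, not a description of how any testing service actually stores or cleans its files; the record layout, field names, and sample values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Record:
    """One row of a hypothetical test file (field names are assumptions)."""
    examinee_id: str
    testing_year: str      # e.g. "1981-82"
    test_date: date
    score: Optional[int]   # None for a no-show or a cancelled score


def clean(records: Iterable[Record]) -> List[Record]:
    """Rule (a): keep only people who actually have a score, and for
    repeaters keep only the most recent score within each testing year."""
    latest = {}
    for rec in records:
        if rec.score is None:
            continue  # never sat the test, or the score was cancelled
        key = (rec.examinee_id, rec.testing_year)
        kept = latest.get(key)
        if kept is None or rec.test_date > kept.test_date:
            latest[key] = rec
    return sorted(latest.values(), key=lambda r: (r.testing_year, r.examinee_id))


# Invented sample data for illustration only.
sample = [
    Record("A1", "1981-82", date(1981, 10, 17), 31),
    Record("A1", "1981-82", date(1982, 2, 20), 35),   # repeat sitting, same year
    Record("B2", "1981-82", date(1981, 12, 5), None),  # registered, did not test
    Record("C3", "1981-82", date(1981, 12, 5), 28),
]

for rec in clean(sample):
    print(rec.examinee_id, rec.testing_year, rec.score)
```

Rule (b) then amounts to computing every reported variable (major, age, citizenship, and so on) from this one cleaned list, rather than from differently filtered subsets of the raw file.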

But cleaning up past data is not as important as providing full and
accurate data for the future. The existing information is extensive,
but in many ways intractable. The suggestions offered below are
designed for the public statistical histories that will be analyzed at
the turn of the next century.

The first group of suggestions concerns the administration of
information:

1) The administrators of the GREs, GMATs, LSATs, and MCATs
should require that all students fill out their respective
background information questionnaires as a condition of
taking the examination.

By "all" students, I mean all: pre-registrants, walk-ins, Monday
test-takers, etc. And at all administrations: domestic, foreign
and special. We cannot afford to work with incomplete data.

We know that some students are always suspicious of questionnaires
and their potential uses (and abuses), and that some will always
resist, even if we protect their anonymity and sign a statement of
assurance that the information will be used only for research
purposes. But we should insist.

Now I admit that this is a very hard line, and one that might prove
counterproductive. We have a choice between honest and complete
data from an incomplete sample or frivolous and contaminated data
from a complete sample. I am an optimist, and prefer to take the
risk. Besides, we're collecting contaminated data now (e.g.
self-reported grades), and don't seem to be too worried about it.

2) No irregularities should be included in any data base drawn
from the tests and/or background questionnaires.

If a student cancels a test score within five days of an
examination (as allowed), the background questionnaire should be
cancelled as well. If a pre-registrant fails to show up for a
particular test administration, the background questionnaire should
not be included in the data files. And if a student walks out of
an examination and capitulates on the spot, the background
information collected by questionnaire should likewise not be
included. Unfortunately, all these irregularities (and others)
turn up in the existing data; and the irregularities do no great
service, even to the primary clients of that data.

The second group of suggestions concerns the content of the information
gathered and reported. The idea is to anticipate interpretive issues
and demographic trends, not to wait until they appear and then be caught
in uncomfortable webs of speculation.

3) All four examination programs should ask the same questions
in the same way about citizenship, native language, and
country in which the examinee's baccalaureate institution is
located.

Questions about citizenship should distinguish between resident and
non-resident status. Questions about language should determine not
only native language, but also primary language of the examinee's
parents (this is particularly important for U.S. minorities), and
the language of instruction in the schools and colleges attended.
For non-resident aliens, it may also be helpful to understand what
kind of postsecondary institution they attended (technical
institute, regional university, national university, agricultural
college, etc.).

4) All four examination programs should ask the same questions
in the same way about the type and characteristics of under-
graduate institutions attended by U.S. citizens and resident
aliens who graduated (or expect to graduate) from U.S.
colleges.

In his study of institutional diversity in higher education,
Robert Birnbaum provides some options for institutional
classification that may be more helpful than the old Carnegie
typology (or even its latest refined edition). (29) The
information one could garner by asking these questions is important
in light of the changing enrollment mix in American higher
education. Our interpretation of test performance by undergraduate
major, for example, would have been enhanced by empirical evidence
of where the examinees received their baccalaureate degrees (at
least by institutional type).

5) All four examination programs should ask the same questions
in the same way concerning the work experience of examinees.

Only the GMATs ask any question about work experience, and confine
that question to the simple notion of extent. In considering
performance on tests of general learned abilities, however, it may
be beneficial to know more about the type of job(s), setting(s) and
responsibilities. At present, approximately 20% of the test-taking
population (exclusive of those who take the MCATs) is over the age
of 30. That percentage is bound to increase, and intervening
variables will become even more important to the analysis of
trends.

6) The four examination programs should jointly develop and
adopt a comprehensive list of undergraduate majors for use on
their background information questionnaires.

The GRE list, with the addition of discriminations among Business
fields (Accounting, Finance, Marketing, Management, etc.), and
trimmed of some existing discriminations that may be a bit too fine
(e.g. Social Psychology, Parasitology, Slavic Studies) would
probably do the job. When reporting, data on mean scores, Standard
Deviations, and candidate volume should be listed by both
individual major and aggregate field (e.g. French/Modern
Languages; Finance/Business Administration; Zoology/Biosciences).
If this information is important now, it may become more important
in the future if our colleges either consolidate programs or move
toward further proliferation.

7) All four examination programs should add a question to their
background information questionnaires that will help deter-
mine why an individual is taking a particular examination.

The commentaries on test score analyses frequently include state-
ments such as, "students take these examinations for many different
reasons." But as long as we fail to ask the question, those
statements are wholly speculative. Among the options are graduate
school admissions requirements (or recommendations), self-
assessment, participation in local program evaluations at colleges,
requirements for the undergraduate degree, requirements in graduate
school programs, requests of employers, student perceptions of the
labor market, etc. I am sure that the examination programs can come
up with a productive question here.

8) All four examination programs should refine their defini-
tions of "age" in reporting data in order to provide accurate
information on the educational careers of examinees.

Here I am indebted to Nancy W. Burton of ETS who, in an internal
proposal to restructure existing data on the GREs, pointed to the
critical distinction between age at testing and age at
baccalaureate. (30) To those two points we should add "age at
graduate degree" for those who already hold graduate degrees.

The third set of suggestions concerns longitudinal research on academic
performance and the impact of schooling in the United States. Ideally,
we should key off the High School and Beyond sample (Class of 1980) for
which college transcripts are currently being coded. The sample has
been drawn, and the base-line data are there (including high school
transcripts, SAT/ACT scores, and other performance data). The analysis
of college transcripts in fine detail (31) will be one of the most
important breakthroughs in our understanding of the college curriculum
as experienced by students, and will better enable us to assess
performance on one or more of the major examinations used in graduate
and professional school admissions. How much of the original HSB sample
will remain by the time we get to the examinees is almost beside the
point if the intent is to establish a true longitudinal study with
standardized performance measures; but to ensure a robust sample, we
could supplement High School and Beyond with a college cohort selected
primarily to reflect the distribution of undergraduate majors in
different types of institutions.

Better still would be a parallel undertaking. It is possible to
identify a sample test-taking population at the point of matriculation
in college (not high school graduation), which could be traced through
various undergraduate experiences (academic and otherwise), employment
and family life, performance on these examinations and performance in
graduate or professional school. A convincing sample of test-takers
(one which reflects the demographic, ability, and enrollment-mix
characteristics of all entering college freshmen, full- and part-time,
and in all types of institutions) would allow us to control for all
those intervening or unknown variables to which we currently turn as
blind excuses for trends that we cannot seem to explain otherwise. Since
we would be able to ask our experimental group to sit for a number of
examinations, we would also be able to test the validity of grouping the
GRE/General examinations, GMAT and LSAT together as tests of "general
learned ability."

19. A Message to the Commentators; a Plea Against Excuses 

Any observer of discussions of test scores over the past two decades
cannot help but notice a very strange phenomenon. Whenever it is
apparent that scores on standardized tests have declined, the testing
services, examination governing boards, college administrators and
(sometimes) the media rush to out-do each other with explanations so
intense and abstruse that they appear as excuses. Whenever scores rise,
however, no such explanations are offered.

The products and services of the testing industry may not be ideal, but
they are of generally high quality. Certainly we have been flooded with
enough studies of reliability and validity to convince us that the
products and services are worth what we pay for them. Paradoxically,
our excuse-mongering for performance casts unnecessary doubts on the
quality of the tests. Whenever the testing services, governing boards,
deans, admission officers, etc. engage in excuse-mongering, they seem to
be saying, "Don't take the product seriously!" What strange behavior!

An analogy from the business world may be appropriate. The behavior of
the groups that control and operate the tests is like that of a CEO who
has to report a quarterly loss to stockholders, and starts reciting
arcane accounting jargon about write-offs and deferred maintenance, let
alone less arcane observations about strikes or energy costs or special
R&D investments. Yet if that same CEO presents a quarterly report
evidencing an outstanding gain in net earnings, it is often left to
individual stockholders (let alone securities analysts or SEC
investigators) to determine how much of the gain was caused by
extraordinary items or unusual circumstances. None of these
explanations (offered or hidden) has anything to do with the quality of
the basic products or services sold by the company, nor do they have
much to do with operating income.

If the test scores decline, we blame everything but the quality of
student learning. If the scores go up, any commentary is complimentary
to schools, colleges, and students. This contradictory behavior does no
great service to students, let alone anyone else associated with the
education enterprise. In the course of this paper, we found that a
number of the standard nostrums used in the excuse process do not have
"face validity" (or, at the least, found no evidence to suggest their
validity). But we also pointed out that it is important to understand
why performance on these measures changes over time.

It appears that what many commentators fear, and why they are defensive
concerning declining scores, is misuse of test results. This is a just
fear, and I share it. But it is not an excuse for excuses. As the
British philosopher, J. L. Austin, has observed, an "excuse" involves a
situation "where someone is said to have done something which is bad,
wrong, inept, unwelcome, or . . . untoward. Thereupon he or someone on
his behalf will try to defend his conduct or get him out of it." (32)
This situation, normally a province of ethics and/or the law, is
somewhat out of place in the realm of academic performance. And
"explanations" for "inept" behavior which often are used in the realm of
academic performance are dominated by conditions, exemptions and
qualifications that sound too much like excuses (at least in their
linguistic properties). The problem is that we use these "explanations"
as if they were statements of cause, which they are not.

Instead of excuses, the justified fear of misuse of test results should
be a spur to the search for the factors within our control that are most
likely to influence student learning. An explanation of test scores
that focuses on something other than teaching, curriculum and learning
is an excuse that will distract the energies and efforts of those who
make educational policy, and will probably turn those energies and
efforts away from the forces that make a real difference in the
professional lives of college faculty and the learning of college
students.

So I conclude with a plea for both candor and focus: the scores and
trends do not explain everything in the world of higher education. None
of them help us measure the development of leadership, artistic talent,
organizational skills, and personal values, for example, that we expect
colleges to advance in different students and upon which the national
culture and polity depend. At the same time, the measures are proven
ones, and the results are important indicators of student learning. As
Alexander Astin remarked in discussing a draft of this paper, the
remarkably simple but sensible notion that "you learn what you study"
seems to escape people when they talk about test scores. I hope it is
an equally simple but sensible notion to propose that if we use these
indicators to raise the level of discussion about the college
curriculum, we will increase their role in the learning and development
process. In this way, the examinations can become formative tools as
well as summative measures. Let us not turn our attention away from that
challenge, for student learning is the bottom line of our business.






NOTES

1. (Page 1) Adelman, Clifford. "The Major Seventh: Standards as a
Leading Tone in Higher Education," in J. R. Warren (ed.), Meeting the
New Demand for Standards. San Francisco: Jossey-Bass, 1983, pp. 39-54.

2. (Page 2) Resnick, Lauren and Daniel Resnick. "Standards, Curriculum,
and Performance: an Historical and Comparative Perspective."
Commissioned paper for the National Commission on Excellence in
Education, 1982. ED 227-104, p. 1.

3. (Page 4) Linda Cole, Statistical Coordinator for the College Board
Achievement Tests at ETS, has performed such studies on Math: Level 1,
Math: Level 2, Biology, Chemistry, Physics, American History and English
Composition, using the 1980 scale on which to measure the effects of
rescalings since 1972-73. The author has examined a handwritten
summary, in table form, of the results of these studies.

4. (Page 7) Jerilee Grandy. Profiles of Prospective Humanities Majors:
1975-1983. Princeton, N.J.: Educational Testing Service, 1984. Final
Report #OP-20119-83 (National Endowment for the Humanities), p. 70.

5. (Page 8) The figures are extrapolated from the following:

Digest of Education Statistics, 1983-84. Washington, D.C.: National
Center for Education Statistics, 1984, Tables 81, 86, and 114, pp.
95, 100, and 132.

The Condition of Education, 1982. Washington, D.C.: National Center
for Education Statistics, 1983, Table 4.5, p. 134.

6. (Page 8) William Turnbull, personal correspondence with the author.

7. (Page 10) The substance of this paragraph is drawn from both personal
correspondence and conversations between Jonathan Warren and the author.

8. (Page 10) Committee of Examiners for the Political Science Test. GRE
Examinations, 1982-1984: a Description of the Political Science Test.
Princeton, N.J.: Educational Testing Service, 1982, p. 7.

9. (Page 10) Ibid., p. 8.

10. (Page 11) W. Ann Reynolds, personal correspondence with author.

11. (Page 11) Wigdor, Alexandra K. and Wendell R. Garner (eds.). Ability
Testing: Uses, Consequences and Controversies. Washington, D.C.:
National Academy Press, 2 vols., 1982. I, p. 56.

12. (Page 11) Anne Anastasi, "Abilities and the Measurement of
Achievement," in William B. Schrader (ed.), Measuring Achievement:
Progress Over a Decade. San Francisco: Jossey-Bass, 1980, p. 8.

13. (Page 12) Nancy Burton, comments to the author on a previous draft
of this paper.






14. (Page 12) Committee of Examiners for the Economics Test. GRE
Examinations, 1982-1984: a Description of the Economics Test.
Princeton, N.J.: Educational Testing Service, 1982, p. 7.

15. (Page 12) The literature on the predictive validity of these
examinations is large and complex. See Richard O. Fortna, Annotated
Bibliography of the Graduate Record Examinations. Princeton, N.J.:
Educational Testing Service, 1980. For a convincing synthesis of
arguments supporting the predictive validity of some of these tests, see
Robert L. Linn, "Admissions Testing on Trial," American Psychologist,
March, 1982, pp. 279-291. The arguments, evidence, and controversies
surrounding predictive validity lie beyond the scope and purpose of this
paper.

16. (Page 16) Bowen, Howard R. Investment in Learning. San
Francisco: Jossey-Bass, 1977, p. 98.

17. (Page 18) It is surprising how sparse the literature is on this
issue. In a 1961-65 study, Astin used the National Merit Scholarship
Qualifying Test (the PSAT) as a control for 669 students who took the
now-discontinued GRE general area tests (in Humanities, Social Sciences,
and Natural Sciences) as college seniors. Astin's objectives were to
measure the impact of the selectivity of institutional environment on
GRE scores. What he found was that, as soon as the controls were
applied, the correlations were negative. At the same time, though,
stepwise linear regression analyses demonstrated that of all student
characteristics on entrance to college, "the most important single
determinant" of the GRE scores was "academic ability as measured during
high school" (by the NMSQT). Alexander W. Astin, "Undergraduate
Achievement and Institutional Excellence." Science, vol. 161, no. 3842
(1968), pp. 661-668.

18. (Page 18) William Turnbull, "Project TRACE: a Prospectus."
Unpublished paper, n.d.

19. (Page 19) When the GRE/Undergraduate Assessment Program came on
stream as a systematic assessment service in 1976-77, it did not
reintroduce the local scoring option for the GRE Subject Area Tests.
But it did offer individual students participating in the UAP the option
of placing their scores in the GRE "history file." The GRE/UAP program,
offered independently of the GRE Testing Program, does not seem to have
had any noticeable effects on candidate volume on the Subject Area Tests
for the years 1976-77 through 1978-79. See Undergraduate Assessment
Program Council, UAP Guide. Princeton, N.J.: Educational Testing
Service, 1976, p. 18.

20. (Page 21) Grandy, p. 72.

21. (Page 22) Wigdor and Garner, I, p. 189.

22. (Page 23) The New York Times, May 7, 1981.

23. (Page 24) Turnbull, personal correspondence with author.




24. (Page 27) Digest of Education Statistics, 1983-84. Table 104,
p. 126.

25. (Page 29) Syverson, Peter D. Summary Report: 1980 Doctorate
Recipients from U.S. Universities. Washington, D.C.: National Academy
Press, 1981. Table 2, pp. 30-31.

26. (Page 32) In fact, the Guidelines for Curriculum state only that
"40 to 60 percent of the coursework in the baccalaureate program shall
be devoted to studies in business administration and economics," but
that "the major portion . . . shall be in business administration." The
only economics course listed as an example is "principles of economics."

27. (Page 32) Baker, Curtis O. Earned Degrees Conferred: an
Examination of Recent Trends. Washington, D.C.: National Center for
Education Statistics, 1981, Table 7, p. 27.

28. (Page 34) Eldon Park, personal communication with author.

29. (Page 39) Birnbaum, Robert. Diversity in American Higher
Education. New York: Institute for Higher Education, Teachers College,
Columbia University, 1982. Final report of Grant #G-81-0058 (National
Institute of Education).

30. (Page 40) Nancy W. Burton. Trends in the Performance and
Participation of Potential Graduate School Applicants. Princeton, N.J.:
Educational Testing Service, 1982. GREB 82-5, p. 12.

31. (Page 40) The only analysis of college transcripts currently in
the literature is that by Robert Blackburn et al., Changing Practices
in Undergraduate Education. Berkeley, Calif.: Carnegie Council on
Policy Studies in Higher Education, 1976. Blackburn and his colleagues
surveyed a stratified sample of 271 institutions and analyzed
transcripts at ten in order to demonstrate the distinctions between
formal degree requirements and the actual student experience of the
curriculum.

32. (Page 41) J. L. Austin, "A Plea for Excuses." In V. C. Chappell
(ed.), Ordinary Language. Englewood Cliffs, N.J.: Prentice-Hall, 1964,
p. 42.






BIBLIOGRAPHY

Adelman, Clifford. "The Major Seventh: Standards as a Leading Tone in
Higher Education." In J. R. Warren (ed.), Meeting the New Demand
for Standards. San Francisco: Jossey-Bass, 1983, pp. 39-54.

Altman, Robert A. and Paul W. Holland. A Summary of Data Collected from
Graduate Record Examinations Test-Takers During 1975-76. Princeton,
N.J.: Educational Testing Service, 1977.

Altman, Robert A. A Summary of Data Collected from Graduate Record
Examinations Test-Takers During 1976-77. Princeton, N.J.: Educational
Testing Service, 1977.

American College Testing Program. College Student Profiles: Norms for
the ACT Assessment. Iowa City, Iowa: ACT, 1983.

Anderson, Charles J. Student Quality in the Humanities: Opinions of
Senior Academic Officials. Washington, D.C.: American Council on
Education, 1984 (Higher Education Panel Report Number 30).

Astin, Alexander W. "Undergraduate Achievement and Institutional
Excellence." Science, vol. 161, no. 3842 (1968), pp. 661-668.

Atelsek, Frank J. Student Quality in the Sciences and Engineering:
Opinions of Senior Academic Officials. Washington, D.C.: American
Council on Education, 1984 (Higher Education Panel Report Number 58).

Baird, Leonard. "Biographical and Educational Correlates of Graduate
and Professional School Admissions Test Scores." Educational and
Psychological Measurement, vol. 36 (1976), pp. 415-520.

Baker, Curtis O. Earned Degrees Conferred: an Examination of Recent
Trends. Washington, D.C.: National Center for Education Statistics,
1981.

Birnbaum, Robert. Diversity in American Higher Education. New York:
Institute for Higher Education, Teachers College, Columbia
University, 1982. (Final Report for NIE Grant #G-81-0058).

Blackburn, Robert et al. Changing Practices in Undergraduate Education.
Berkeley, Calif.: Carnegie Council on Policy Studies in Higher
Education, 1976.

Bowen, Howard R. Investment in Learning. San Francisco: Jossey-Bass,
1977.

Buros, Oscar K. "Fifty Years in Testing: Some Reminiscences,
Criticisms, and Suggestions." Educational Researcher, vol. 6, no. 7
(1977), pp. 9-15.

Burton, Nancy W. Trends in the Performance and Participation of
Potential Graduate School Applicants. Princeton, N.J.: Educational
Testing Service, 1982. GREB 82-5.




Conrad, Linda, Donald Trismen and Ruth Miller. Graduate Record
Examinations Technical Manual. Princeton, N.J.: Educational Testing
Service, 1977.

Cronbach, L. J. "Five Decades of Public Controversy Over Mental
Testing." American Psychologist, January 1975, pp. 1-14.

Ferguson, Richard L. and E. James Maxey. Trends in the Academic
Performance of High School and College Students. Iowa City, Ia.:
American College Testing Program, 1976 (ACT Research Report No. 70).

Fortna, Richard O. Annotated Bibliography of the Graduate Record
Examinations. Princeton, N.J.: Educational Testing Service, 1980.

Frederickson, Norman and William C. Ward. Development of Measures for
the Study of Creativity. Princeton, N.J.: Educational Testing
Service, 1975. GREB #72-2P.

Goodison, Marlene B. A Summary of Data Collected from Graduate Record
Examinations Test-Takers During 1980-81. Princeton, N.J.:
Educational Testing Service, 1982.

Goodison, Marlene B. A Summary of Data Collected from Graduate Record
Examinations Test-Takers During 1981-1982. Princeton, N.J.:
Educational Testing Service, 1983.

Graduate Management Admissions Council. The Official Guide to the
Graduate Management Admissions Test. Princeton, N.J.: Educational
Testing Service, 1982.

Graduate Record Examinations Board. Graduate Programs and Admissions
Manual. Princeton, N.J.: Educational Testing Service, 1972, 1976,
and 1980.

Graduate Record Examinations Board. Guide to the Use of the Graduate
Record Examinations. Princeton, N.J.: Educational Testing Service
[year illegible].

Grandy, Jerilee. Profiles of Prospective Humanities Majors: 1975-1983.
Princeton, N.J.: Educational Testing Service, 1984. (Final Report
for NEH Grant #OP-20119-83).

Hartnett, Rodney T. and John A. Centra. "The Effects of Academic
Departments on Student Learning." Journal of Higher Education,
vol. 45, no. 5 (1977), pp. 491-507.

Kraft, I. "Admissions to Public Professional Schools and Administrative
Openness." Journal of Legal Education, vol. 29 (1977), pp. 52-61.

Levine, Murray. "The Academic Achievement Test: Its Historical Context
and Social Functions." American Psychologist, March 1976, pp.
228-238.

Linn, Robert L. "Admissions Testing on Trial." American Psychologist,
March, 1982, pp. 279-291.




Manning, Winton H. "Social Trends and the Future of Testing in Schools
and Colleges." Unpublished paper, Fall 1983.

Maxfield, Betty D. Science, Engineering, and Humanities Doctorates in
the United States: 1981 Profile. Washington, D.C.: National Academy
Press, 1982.

Messick, Samuel. "The Standard Problem: Meaning and Values in
Measurement and Evaluation." American Psychologist, Oct. 1975, pp.
955-966.

Momeni, Jamshid. Adult Participation in Education: Past Trends and Some
Projections for the 1980s. Washington, D.C.: National Institute for
Work and Learning, 1980.

Munday, Leo A. and Jeanne C. Davis. Variety of Accomplishment After
College: Perspectives on the Meaning of Academic Talent. Iowa City,
Iowa: American College Testing Program, 1974 (ACT Research Report
No. 62).

Munday, Leo A. Toward a Social Audit of Colleges: an Examination of
College Student Outcomes in Terms of Admissions Information. Iowa
City, Iowa: American College Testing Program, 1976 (ACT Research
Report No. 75).

Oltman, Philip K. Content Representativeness of the GRE Advanced Tests
in Chemistry, Computer Science and Education. Princeton, N.J.:
Educational Testing Service, 1982 (GREB 81-12P).

Resnick, Lauren and Daniel Resnick. "Standards, Curriculum and
Performance: an Historical and Comparative Perspective." ED #227-104,
1982.

Rock, Donald and Charles Werts. An Analysis of Time-Related Score
Increments and/or Decrements for GRE Repeaters Across Ability and
Sex Groups. Princeton, N.J.: Educational Testing Service, 1980.
GREB 77-9R.

Schrader, William B. (ed.) Measuring Achievement: Progress Over a
Decade. San Francisco: Jossey-Bass, Inc., 1980.

Schrader, William B. (ed.) Admissions Testing and the Public Interest.
San Francisco: Jossey-Bass, Inc., 1981.

Solmon, Lewis C. Schooling and Subsequent Success: Influence of Ability,
Background, and Formal Education. Iowa City, Iowa: American College
Testing Program, 1973 (ACT Research Report No. 57).

Syverson, Peter D. Summary Report: 1980 Doctorate Recipients from
U.S. Universities. Washington, D.C.: National Academy Press, 1981.

Turnbull, William. "Education's Report Card: What Will It Say About
Colleges?" Unpublished paper, Feb. 1984.

Turnbull, William. "Testing Scores in Perspective," Educational Record,
vol. 59, no. 4 (1978), pp. 291-296.

Undergraduate Assessment Program Council. Undergraduate Assessment
Program Guide. Princeton, N.J.: Educational Testing Service, 1976.

Wigdor, Alexandra K. and Wendell R. Garner (eds.). Ability Testing:
Uses, Consequences and Controversies. 2 vols. Washington, D.C.:
National Academy Press, 1982.

Wild, Cheryl L. A Summary of Data Collected from Graduate Record
Examination Test-Takers During 1977-78. Princeton, N.J.: Educational
Testing Service, 1979.

Wild, Cheryl L. A Summary of Data Collected from Graduate Record
Examination Test-Takers During 1978-79. Princeton, N.J.: Educational
Testing Service, 1980.

Wild, Cheryl L. A Summary of Data Collected from Graduate Record
Examination Test-Takers During 1979-80. Princeton, N.J.: Educational
Testing Service, 1981.

Wild, Cheryl L. Summary of Research on Restructuring the Graduate Record
Examinations Aptitude Test. Princeton, N.J.: Educational Testing
Service, 1979.

Wilson, Kenneth M. The Validation of GRE Scores as Predictors of First
Year Performance in Graduate Study. Princeton, N.J.: Educational
Testing Service, 1979. GREB 75-8R.



TABLE A: MEAN SCORES, STANDARD DEVIATIONS, AND NUMBER OF TEST TAKERS, 1964-1982
Test Title   1964-65 1965-66 1966-67 1967-68 1968-69 1969-70 1970-71 1971-72 1972-73 1973-74 1974-75 1975-76 1976-77 1977-78 1978-79 1979-80 1980-81 1981-82





IMmi 


N.A. 


US 


«B6 


US 


«81 




s.o. 


N.A. 


. 96 


98 


99 


103 




N 


N.A. 


«0.153 


«3.652 


57.567 


67.267 


IMT 


MMrt 


510 


' 511 


51« 


516 


516 




S.O. 


N.A. 




N.A. 


• N.A. 


102 




N 


39.162 


W.776 


^.752 


«9.897 

• 




NCAT/Mad. 


S.O. 


iiw mailt MM 4 




in bath f oawt i 




N 


f.A. 


It.lOS 


22.2ii 


2i«S99 


M98M 



103 



518 
102 



«66 
105 



519 
103 



U2 
106 



521 
102 



his 
105 



522 
\0k 



U3 
107 



527 
\0k 



Ul 
108 



520 
105 



U3 
107 



530* 
109* 



MCAT/Quan. Mean
S.D.

GRE/Ver. Mean
S.D.

N



Biology

(GRE AREA) S.D.
N



The MCAT was changed in both format and scale in 1977. Reporting is on a calendar year basis.



S.O. 



590 



533 
137 

617 

5.228 



520 
125 



528 

133 

610 
115 
6.597 



519 
125 



528 
13« 

613 
Uk 
7.831 



.520 



527 
135 

61« 

Uk 



515 
12« 



132 

613 

112 



503 
123 



516 
132 

603 

111 



*97 
125 



512 
13« 

603 



126 



508 
136 

606 
115 



*97 
125 



512 
135 

619 
110 



^2 
126 



509 
137 

62« 

110 



kn 

125 



508 
137 

623 

110 



«92 

127 



510 
138 

627 
112 



«62 
107 



533* 
107* 



7.98 
2.41 
6.S79 

7.99 
2.S4 

«90 
129 



51« 
139 

629 

113 



«65 



528* 
110* 



463 

105 



536* 
107* 



H2 

104 



539* 
108* 



U7 
|04 

110* 



468 
104 



553* 
110* 



Biology
(MCAT)



S.D.
N



Biology test was not offered on the MCAT until 1977. Reporting is by calendar year.



Chemistry Mean
(GRE AREA) S.D.
N

Chemistry Mean

(MCAT) S.D.

N



, 7.87 

2.39 
S6.S79 

630 
109 
5.268 

7.82 

A separate Chemistry test was not offered on the MCAT until 1977. Reporting is by calendar year.

S6,S79 



4199 
114 
3.783 



618 
110 
S.919 



615 
104 
4,139 



617 

104. 
4.781 ' 



613 
104 
4.791 



613 
113 
5.411 



618 

117 
5.350 



624 
124 
4.833 



630 
114 
4.535 



634 
115 
4.648 



629 

105 
4.936 



627 

107 
5.058 



8.01 
2.49 
S1.791 


7.40 
2.29 
49.075 


7. 73 

2.53 
49,9a 


7.S0 
2.52 
49,203 


7.74 
2.93 
47,597 


7.91 
2.53 


7,78 
2.39 


7.56 
2.92 


7.45 
2.49 


7.45 
2.47 


484 

128 


476 

130 


474 
131 


473 
128 


469 

130 


286.383 282,481. 27l>t8t 261,855 is6.38l 


518 

135 


517 
135 


522 
JJ6. 


523 
136 


533 
137 


622. 
113 
20.842 


621 
117 
18,795 


619 
115 
16,693 


617 
• t15 
85.002 


616 

114 
14.185 


7.90 
2.39 
51.791 


7.89 
2.31 
49,075 


8.09 
2.57 
49,949 


8.13 
2.51 
49,203 


i.23 
2.S5 
47,597 


624 
108 
5.671 


623 
104 
5>725 • 


618 

105 
5)1422 


615 

103 
4!!9tt 


616 

105 
4.940 


7.75 
2.52 
51.791 


8.06 

2.30 
49,075 


7.91 
.2.49 
49,949 


7.94 
3.52 
49,203 


7.99 
2.49 
47,597 



* Figures for the LSAT from 1964/5 through 1974/5 were taken from data generated by the Educational Testing
Service and duplicated on a sheet entitled, "LSAT Score Statistics by Year." The figures for the LSAT from
1975/6 through 1981/2 were derived from data analysis performed under contract to the National Institute of
Education by the Law School Admissions Services. As discussed in the text of this paper, there are significant
differences between the two data sets, and hence it is difficult to compare pre-1975 data with post-1975 data.






1964-65 1965-66 1966-67 1967-68 1968-69 1969-70 1970-71 1971-72 1972-73 1973-74 1974-75 1975-76 1976-77 1977-78 1978-79 1979-80 1980-81 1981-82



Tit TItU fc *a 

? S J .1 ..i J J S J .1 3 ..K 

mam mm r-f ti- i.ooo t«i. «m . W ?g '2 ^ ^ U 11 '« » 

SfrUiiSr .1.18 i.«7 ijl J.w '.'w UDO ».'« «•«" 

i'- s.;ii •♦.;« ^.i" /.ii? •■•» '•»• 

}{l!f««»i'- «.IS? «.K M'5 '.i" »•« '''^^ ♦••^ 

1SWl«.r ..,ft 5.<S S.7ft 5.H7 ».»7 3.5* 

W W m ;s ft5 JI? ftl ?5> .111 

(«Sc ME<J S.B. ^ ^Jl J J M „ j|J „ M „,,!, „.«?, ,,.»,7 1J.JJ7 W.lOO H.I70 17.M5 

mm 'w. 'w W . W W »g WJ •« »|| «5 i; *|j 

(SI MM) S.I). . M . « ,. JJ •! „ ,K „.,S I7.»fl 25.3U H.BtJ 17.2^5 I5.W7 IJ..W 






. 6S2 
3,263 


6%8 

1*5 
3.302 


6S0 
1*8 
3.255 


6*8 

1*7 
3.227 


•4V 

14S 
9.U2 


$77 
9S 
2.89* 


,57% 
91 
2.991 


576 
90 
3.077 


97* 
91 
3.0*7 


STO 

m 

s.sst 


692 

1(0 
3.736 


679 
>1S5 
3.3*2 


696 
1(0 
3.*2* 


695 
163 
3.109 


•M 

159 
3.217 


1 1^ 

7.S35 


591 

• 

8,036 


1l6 

7.7*7 


190 
116 
7.*69 


MS 

lis 
T.4ta 


609 
109 
3.671 


601 
110 
3.68* 


603 
106 
3.600 


108 
3.*36 


•14 

10« 
3.4U 


*71 
,90 
3.*11 


*6* 

3.lS 


^4 
88 

1.792 


U1 
85 
2,512 


4^1 
•• 

1^249 


«S0 
113 
2,875 


*36 

117 
2,617 


*38 
109 
2,2(8 


*27 
109 
1,9(7 


499 

l.sat 


529 
97 
17.270 


530 

97 
16,515 


15.656 


938 SS> 
97 •» 
1«,802 1S,23Y 


«52 
91 

12.3tO 


*51 
89 
10.827 


**9 

90 
9.1^9 


*5S 
90 
7.*(0 


4M 
•9 

5,791 




Test Title   1964-65 1965-66 1966-67 1967-68 1968-69 1969-70 1970-71 1971-72 1972-73 1973-74 1974-75 1975-76 1976-77 1977-78 1978-79 1979-80 1980-81 1981-82



History Mean
(GRE AREA) S.D.

N


S84 
81 
4,746 


581 
87 

7,078 


557 
82 
7,923 


554 
82 

9,213 


548 
82 

9,599 


535 
84 

11,519 


528 
83 

U,026 


527 
85 
9,506 


539- 

ii3 

7,276 


Literature Mean
in English S.D.

(GRE AREA) N


591 
95 
8,533 


588 

94 
8,845 


582 
91 

10, 3M 


572 
91 
12,511 


569 
89 
13,477 


556 
90 
15,016 


546 

91 
14,553 


544 

96 
12,810 


545 
96 
10,909 


(GRE AREA) S.D.

N


569 
98 

1,102 


972 
92 

1,535 


563 
92 
1,850 


685 
93 

3,217 


560 
94 
3,406 


549 
93 
2,431 


531 
91 
2,497 


539 
91 
2,038 


531 
95 
1,793 


Music Mean
(GRE AREA) S.D.

N


Hot 

Of find 


fMfVt 

IfOOO 


518 

103 
1,207 


517 
100 
1,550 


511 
98 
1,856 


4tl 
99 
2,479 


481 

99 
2,640 


483 

100 
2,681 


494 

95 
2,670 



518 



99 

9,921 



liiat- 
Takmn 



^ 2,783 



913 
93 
5,309 


519 

93 
4,823 


• 510 
82 
4,276 


509 
82 
3,980 


SOS 
82 
3,843 


541 
100 
8,351 


53«^ 
101 
7,035 


532 
101 
6,949 


. 530 
,102 
8,420 


525 
105 

s,9a 


522 
91 
1,435 


519 
■ 90 
1,169 


515 
. 94 
1,029 


514 
90 
994 


507 
90 
1,028 


505 
. 91 
2,931 


499 

.. 95 
2,865 


492 

95 
3,339 


503 
. 97 
3,597 


- 499 
3,496 



509 


502 


507 


90 


77 


78 


3,323 


3,738 


3,458 


521 


930 


521 


99 


UO 


100 


9,557 


4,923 


4,885 



527 522 519 515 514 507 Fewer than 1,000

98 91 90 94 90 90 test-takers, therefore

,642 1,435 1,169 1,029 994 1,028 not included.

499 505 499 492 503 - 499 500 

95 



94 . 91 90 





TABLE B

Selected Background Data on GRE, LSAT, GMAT, and MCAT Test Takers, 1975-1982

(All figures are rounded percentages of the Universe used)

                                                   ENGLISH AS A
                                    NON-U.S.       SECOND
               AGE 19-24            CITIZEN        LANGUAGE       FEMALE
               GRE  LSAT GMAT*      GRE  GMAT      GRE  GMAT      GRE  LSAT  GMAT  MCAT

Universe (see code below):
Universe         2     3     2        1     3        1     3        2     3     2     3

1975-1976       61   6 b    NA        8    NA        6    NA       NA    29    NA    27
1976-1977       60    62    NA        8    NA        6    NA       NA    32    NA    27
1977-1978       57    62    52        9    20        6    19       NA    34    30    30
1978-1979       55    61    53       10    21        8    19       55    35    34    32
1979-1980       53    58    52       11    21        8    20       56    37    36    34
1980-1981       52    57    51       12    20        9    20       57    39    37    36
1981-1982       52    57             13    21       10    21       57    39    38    37


               WHITE                      BLACK                      HISPANIC
               GRE  LSAT  GMAT  MCAT      GRE  LSAT  GMAT  MCAT      GRE  LSAT  GMAT  MCAT

Universe         2     3     2     3        2     3     2     3        2     3     2     3

1975-1976       87    **    NA    NA        7          NA    NA        2    **    NA    NA
1976-1977       87    **    NA    80        7          NA     9        3    **    NA     5
1977-1978       86    **    82    82        7           6     8        3    **           4
1978-1979       86    **    82    81        7           6     8        3    **           6
1979-1980       86    83    82    80        7     7     6     8        3     4           6
1980-1981       86    84    82    79        7     7     6     8        3     4           8
1981-1982       86    84    83    78        7     7     6     8        3     4           7


               ATTENDED             2+ YRS         COLL.
               GRADUATE SCHOOL      WORK EXPER.    GRAD
               GRE  LSAT            GMAT           MCAT

Universe         1     3               2              3

1975-1976       22    11              NA             28
1976-1977       23    12              NA             34
1977-1978       20    12              50             33
1978-1979       21    13              50             31
1979-1980       22    15              52             33
1980-1981       22    15              53             31
1981-1982       23    14              50             33


Universe

#1 = All first time test takers only
#2 = All U.S. Citizens only
#3 = All test takers (percentages based on respondents only)

* 19-25 for the GMAT

4^ Impossible to determine for U.S. Citizens

** The percentage of non-respondents in these years was so high that the residual data should not be used. According to the LSAS analysis,
47.3% of the test-takers did not respond to the question in 1975-6; 40.3% in 1976-7; 42.6% in 1977-8; and 39.2% in 1978-9. The percentage
of non-respondents dropped to virtually zero in 1979-80 when the position of the question on the background information form was changed,
i.e. the question became part of the test-registration information as opposed to the information to be provided to law schools.






TABLE C

DEGREE OF CHANGE IN SCORES, 1964-1982 (UNLESS OTHERWISE INDICATED)

                         Scale [1] of       Approximate   CHANGE IN STANDARD DEVIATION UNITS
                         Reported Scores    Number of     S.D. in      Mean S.D.
Test                     Max.      Min.     Questions     Base Year    over 18 Years   Term

GMAT (1965-82)           800       200      150           -.18         -.16            Small decline
LSAT (1968-74) [2]       800       200      115           +.04         N.A.            No change
LSAT (1975-82) [2]       800       200                    +.21         N.A.            Moderate increase
MCAT (1977-82)           15                 325 [3]
  Reading                                   60            -.09         N.A.            Small decline
  Quantitative                              60            -.21         N.A.            Moderate decline
  Biology                                   55            +.15         N.A.            Small increase
  Chemistry                                 70            +.06         N.A.            No change
GRE: Verbal              850 [4]   210      [illegible]   -.49         -.48            Large decline
     Quantitative        845 [4]   200      [illegible]    .00          .00            No change
     Biology             990       260      210           -.01         -.01            No change
     Chemistry           990       440      150           -.11         -.11            Small decline
     Physics             990       370      100           +.18         +.17            Small increase
     Geology             910       300      200           -.32         -.35            Moderate decline
     Mathematics         990       420      66            +.30         +.28            Moderate increase
     Engineering         990       320      150           -.23         -.22            Moderate decline
     Economics           990       400      160           -.07         -.08            No change
     Polit. Sci.         850       250      170           -.96         -1.02           Extreme decline
     Psychology          940       270      200           -.26         -.26            Moderate decline
     Sociology           990       210      200           -.93         -.96            Extreme decline
     Education           810       220      200           -.29         -.28            Moderate decline
     History             870       330      190           -.70         -.70            Large decline
     English Lit.        810       250      230           -.74         -.72            Large decline
     French              810       290      190           -.65         -.68            Large decline
     Music**             820       270      200           -.23         -.25            Moderate decline



NOTES:

1. Scales in use since 1976 (with exceptions as noted).
2. LSAT data is broken into two periods throughout this paper, corresponding to the
   two different data sets upon which we drew. See the discussion in the [illegible].
   Computations for the first period use 1968-69 as the base year, as Standard
   Deviations for prior years are unavailable.
3. There are 6 sections of the MCATs (we use only 4 here). Some of the questions are
   experimental and not counted in scoring; others are scored in more than one section.
4. Average of 7 forms of the GRE General Examination used between 1973 and 1977.
5. The number of questions changed in 1976 with the introduction of the GRE/Analytic.
6. Base year is 1967-8. Prior to then, there were fewer than 1,000 candidates annually.
7. Terminal year is 1978-9. Since then, there have been fewer than 1,000 candidates annually.
8. Base year is 1966-7. Prior to then, there were fewer than 1,000 candidates annually.
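
The two "change" columns of Table C can be read as the same calculation with two different divisors. The sketch below assumes that the change in Standard Deviation Units is the difference between the terminal-year mean and the base-year mean, divided either by the base-year S.D. or by the mean S.D. over the period. The function name and the input figures are illustrative, not taken from the report's worksheets.

```python
def sd_unit_change(base_mean, terminal_mean, sd):
    """Change in mean score expressed in standard deviation units.

    Assumption: change = (terminal mean - base mean) / S.D., where the S.D.
    is either the base-year value or the mean S.D. over the period.
    """
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (terminal_mean - base_mean) / sd


# Illustrative inputs: a mean that falls 92 points, from 553 to 461.
base_mean, terminal_mean = 553, 461

# Dividing by a base-year S.D. of 96 and by a mean S.D. of 90 gives
# two slightly different readings of the same 92-point drop.
using_base_year_sd = round(sd_unit_change(base_mean, terminal_mean, 96), 2)
using_mean_sd = round(sd_unit_change(base_mean, terminal_mean, 90), 2)
print(using_base_year_sd, using_mean_sd)
```

A smaller divisor makes the same point drop look larger, which is why the two columns of Table C can differ by a few hundredths for the same test.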






TABLE D

SUMMARY OF TEST SCORE CHANGES BY STANDARD DEVIATION UNITS*

                    Descriptive Term and Tests                  Change

+.40 and above      Large Increase: NONE

+.20 to +.39        Moderate Increase:
                      Mathematics (GRE Area Test)                +.28
                      LSAT (1975-82)                             +.21

+.10 to +.19        Small Increase:
                      Physics (GRE)                              +.17
                      Biology (MCAT Sub-test; 1977-1982)         +.15

-.09 to +.09        No Change:
                      Chemistry (MCAT Sub-test; 1977-1982)       +.07
                      LSAT (1968-74)                             +.04
                      GRE Quantitative                            .00
                      Biology (GRE)                              -.01
                      Economics (GRE)                            -.08

-.10 to -.19        Small Decline:
                      Reading (MCAT Sub-test; 1977-1982)         -.10
                      Chemistry (GRE)                            -.11
                      GMAT                                       -.16

-.20 to -.39        Moderate Decline:
                      MCAT Quantitative (1977-1982)              -.22
                      Engineering (GRE)                          -.22
                      Music (GRE)                                -.25
                      Psychology (GRE)                           -.26
                      Education (GRE)                            -.28
                      Geology (GRE)                              -.35

-.40 to -.74        Large Decline:
                      GRE Verbal                                 -.48
                      French (GRE)                               -.68
                      History (GRE)                              -.70
                      English Lit. (GRE)                         -.72

-.75 and below      Extreme Decline:
                      Sociology                                  -.96
                      Political Science                         -1.02

*All test-takers, including non-U.S. citizens.
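
The bands of Table D amount to a lookup from a change in Standard Deviation Units to a descriptive term. The following sketch encodes those bands as printed. The function name is hypothetical, and working in whole hundredths is a choice made here to avoid floating-point edge effects, not something the report specifies.

```python
def describe_change(change):
    """Map a change in S.D. units to the descriptive term used in Table D."""
    hundredths = round(change * 100)  # e.g. -0.48 -> -48
    if hundredths >= 40:
        return "Large Increase"
    if hundredths >= 20:
        return "Moderate Increase"
    if hundredths >= 10:
        return "Small Increase"
    if hundredths >= -9:
        return "No Change"
    if hundredths >= -19:
        return "Small Decline"
    if hundredths >= -39:
        return "Moderate Decline"
    if hundredths >= -74:
        return "Large Decline"
    return "Extreme Decline"


for test, change in [("Mathematics (GRE)", 0.28),
                     ("GRE Verbal", -0.48),
                     ("Political Science", -1.02)]:
    print(test, describe_change(change))
```

The bands are asymmetric only in their labels: a rise of .40 or more would be a Large Increase, while declines are split further at -.75 into Large and Extreme.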




Table E. TURNING POINTS IN GRADUATE RECORD TEST SCORE TRENDS, 1964-1982

Abbreviations:

M    = Mean Score
S.D. = Standard Deviation
SDU  = Change in terms of Stand. Dev. Units from previous reference year
N    = Number of test-takers



TH llM/i 1tH>M ItTP-n 1971-78 197g-n 1973-71 l^/l-iS 1978-76 1976-77 1977-78 Wyi.1981 IWI/f s!?! Mff.f ttS;!!; 

, 130. 128 

' ft 

^ 533 0 
137 137 

.16 .00 
2S6.4k 

... ' 
StabI* 1,4 118 

Stable ||| 

(.16) (.11) 
^ 4.9lt 

696 MS 

Up 159 188 

i% 

647 t|4 
xfiMI. xn 143 

(.06) .17 
3.3k 

S93 

115 112 

.05 (.22) 
7.Sk 



S.D. 
SW 


ISO ' 
114 






*> 


. 4W 

ICO 

(.31) . 
301.1k 






we/Q N 

«• Ira 

sou 


197 
i3.8k 


Stable 




508 
1 jtt 
(.18) 
293. Sk 




o 6 

• 




Hal. n 

• Ik 

sou 

N 


114 

llf 


c 


603 
114 

j.it) 

1t.9k 




t. 


627 
112 
.21 
20.4k 




QiM. R 
S.0. 
SOU 
N 


its 

114 
Tik 


613 

6ai«i 113 
(.13) 
4.8k 






634 
115 
.19 
4.6k 






mtii. N 

S.D. 

sou 


IS1 

1st 

4.31 


641 

9am 154 
(.07) 
7.4k 








6^3 
16? 
.34 
4.1k 




• 

fliyila N 

s.o. 
sou 

N 


623 

134 

4.0k 


til 










6M 
144 

.25 

3.nk 


Enftrf. N . 

s.o. 
sou 

N 


il8 
109 

6.61 


9am 


587 
115 
(.29) 
9.2k 











Stabia 



TABLE Page 2 



Table E-2 



Ntt 



1968-69 1969-70 1970-71 1971-72 1972-73 1973-74 1974-75 1975-76 1976-77 1977-78 1978-81 1981/2 S.D. Diff. Change



Ccpn. M 
S.D. 
SOU 
N 



MS 
127 

••l.1lt 



S8J 
108 

«.9k 



609 61« 

109 Up ' •'108 112 
.26 • .05 

3.7k . 3.5k 



.(.08) 



Set. S.D. 
SOU 
N 

Soel- M 
olofy S.O. 
SOU 
N 

S.O. 

sou 

N 



tlon 



N 

S.O. 

sou 

N 



NItttfry N 
S.O. 
SOU 
N 

Cfl«1lsfi N 
Lit. S.O. 
SOU 
N 



5S) 
96 

2.7k 

S«6 

122 

1.6k 

S56 
91 

S.6k 

Ml 
86 

>♦.. 
6.2k 

81 

mmm 

4.7k 

591 

95 

6.5k 



ObMn, 



Stable 



OdNn 



OOMH 



*79 

91 
(.77) 
5.5k 

«59 

126 
(.70 
f,.2k 

528 
92 
(.31) 
19.6k 

kk^ 

95 
(.M) 
29.H 



96 
(.49) 
12.8k 



489 
88 
.11 
4.4k 

472 
120 
JO 
4.3k 



547 
99 
.03 
9.ek 



455 

93 
.09 
15.5k 

513 
83 

(.63) 
5.3k 



461 -92 
OONH 86 90 

(.32) (1.02) 
2.2k I 

433 -113 
9cm 106 122 

(.33) ' (.96) 
1.6k 

532 -24 
Stib1« 97 • 94 

.04 (.26) 
15.2k 

*?6 -25 
Itible '89 90 

.01 (.28) 
5.6k 

< 

507 ^ -57 
Stable 78 82 

(.07) (.70) 
2.5k 

521 -70 
Stabia 100 97 

(.26) (.72) 
4.9k 1 



TABLE E, Page 3 



Table E-3 



Ttrntnal 





9 


VMr: 


TrMd ' 

19CS-69 19(9-70 


1§70-7I 


1971-72 1972-73 1973-7^ 


l97*-75 1975-76 1976-77 


Trand 
197)1-71 1978-1981 


Vaarr 
1981/2 


Ave. 

S.D. 


Kat 

ft. 
Olff. 


S.D.O. 
ChanM 




* 

N 

t.O. 
SOU 
N 


|8 

1.1k 


Itot 
etblt 




S69 

92 
(.33) 
IKfk 


5S1 
92 
.13 
2.7k 


Stable 


570 
86 
(.12) 
3.<^k 


9* 


-33 


(.35) 


frtnch 




19SVS 


• 


















SQO. 
SOU 
N 


SS9 
96 

1.1k 


OOMH 


S39 

(.30 
2.5k 






50? 

90 
(.35) 
t.Ok 


1978/9 
l«. 
farm, 
year 


91 


-62 


o 

(.68) 


Mutte 




19M/7 






















N 

S.O. 

SOU 

-N 


5ia 

103 
1.2k 


Not 
cabla 


Ml 

(.30 
2.6k 




SOS 
91 

2.8k 


Stabta 


h9k 

90 
(.12) 
2.0k 


96 


•2^ 


(.25) 






TURNING POINT YEARS IN GRADUATE RECORD EXAMINATIONS SCORES (change measured in Standard Deviation Units),




TABLE F 






TABLE F-1 



TEST PERFORMANCE BY UNDERGRADUATE MAJOR, 1977-1982

This set of tables shows the percentages by which the mean score of
test-takers who majored as undergraduates in specific fields differed from
the mean score of all test-takers who indicated their undergraduate major
on the background questionnaires administered with the LSAT, GMAT and
GRE General examinations.
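
As a minimal sketch of how each cell in these tables can be read, the code below computes the percentage by which a major's mean lies above or below the universe mean, and prints it with the tables' convention of parentheses for "below." The function names and the two major means (567 and 507) are hypothetical; only the universe mean of 538 follows the LSAT figure shown for 1977-78.

```python
def pct_above_universe(major_mean, universe_mean):
    """Percentage by which a major's mean is above (negative: below) the universe mean."""
    return 100.0 * (major_mean - universe_mean) / universe_mean


def format_differential(pct):
    """Format to one decimal place, with parentheses marking values below the mean."""
    rounded = round(pct, 1)
    text = f"{abs(rounded):.1f}"
    return f"({text})" if rounded < 0 else text


universe_mean = 538  # LSAT mean for the subject universe, 1977-78
print(format_differential(pct_above_universe(567, universe_mean)))  # hypothetical major mean
print(format_differential(pct_above_universe(507, universe_mean)))  # hypothetical major mean
```

Because the differential is a percentage of the universe mean rather than a number of scale points, entries can be compared across tests with different score scales.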

Since the background characteristics (including undergraduate major) of GMAT
test-takers are not available prior to 1977, I can present only five (5) recent
years of data in these tables.

The data are presented in two ways: (1) by year (pages F-4 to F-8) and (2) by
major, within general curricular groupings (pages F-9 to F-12).

Here is a guide to reading the tables:

The dimensions of the universe of test-takers on which this analysis focuses
are indicated at the top of the page for each year (F-4 to F-8).

Total Test-Takers:    LSAT, as reported in the NIE-sponsored analysis, and
                      used with the permission of the Law School Admissions
                      Council;

                      GMAT, as reported in the comprehensive tables provided
                      to the author, copyright by the Graduate Management
                      Admissions Council and used with its permission;

                      GRE, as reported on a set of tables provided by ETS
                      and entitled, "GRE Candidate Volume, Means, and
                      Standard Deviations," covering the years 1964-1983.
                      The total number of test-takers on these tables
                      differs from that reported in the annual GRE Summary.

Subject Universe:     The objective was to winnow out non-U.S. citizens,
                      people who had scores reported in a given year but
                      who did not take the test that particular year, etc.

                      LSAT: tends to be the full number reported in "Total
                      Test-Takers" (the LSAT does not ask a question about
                      citizenship);

                      GMAT: those who responded to the background question
                      on citizenship by indicating that they were U.S.
                      citizens;

                      GRE: all first-time test-takers. Mean scores for U.S.
                      citizens only would be slightly higher on the GRE/V
                      and slightly lower on the GRE/Q. It is possible to
                      disaggregate scores for U.S. citizens x first-time
                      test-takers for three out of the five years reported.

Respondents:          The number of individuals in the subject universe who
                      identified their undergraduate major field.

% Respondents/        Self-evident. Why the LSAT percentage is significantly
Universe:             lower than the others is a mystery.
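
The "% Respondents/Universe" line is a simple ratio. As a sketch, it can be recomputed from the counts printed for 1977-78, where the LSAT subject universe is 111,555 with 85,422 respondents and the GMAT subject universe is 132,385 with 131,778 respondents. The function name is hypothetical.

```python
def response_rate(respondents, universe):
    """Respondents as a percentage of the subject universe, to one decimal place."""
    if universe <= 0:
        raise ValueError("universe must be positive")
    return round(100.0 * respondents / universe, 1)


# 1977-78 counts for the LSAT and GMAT subject universes.
lsat_rate = response_rate(85_422, 111_555)
gmat_rate = response_rate(131_778, 132_385)
print(lsat_rate, gmat_rate)
```

Recomputing the ratio this way is also a quick check on the legibility of the printed counts for each year.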

TABLE F-2



Mean and Standard     The Mean and Standard Deviation are indicated for
Deviation:            the subject universe only, and thus differ from
                      the Mean and Standard Deviation for all test-takers.



Analysis by Major. No two of these examinations ask the question, "What
is your undergraduate major?" the same way. However, it was possible to create
30 standard categories, with the criterion that, to be included in the analysis,
each undergraduate major category had to account for 0.5% or more of the
respondents.
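
The inclusion criterion can be sketched as a single comparison. The version below assumes the share is taken against the number of respondents for the test in question, and the function name is hypothetical. The count of 2,907 of 435,177 respondents follows the 1977-78 figures for Computer Science, while the count of 1,500 is invented to show a category that would fall short.

```python
def qualifies(major_count, respondents, threshold_pct=0.5):
    """True when a major accounts for at least threshold_pct percent of respondents.

    Assumption: the denominator is the number of respondents, as the text
    states; a category that fails the test is shown as N.Q. in the tables.
    """
    if respondents <= 0:
        raise ValueError("respondents must be positive")
    return 100.0 * major_count / respondents >= threshold_pct


print(qualifies(2_907, 435_177))   # about 0.67 percent of respondents
print(qualifies(1_500, 435_177))   # hypothetical count, about 0.34 percent
```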

Depending on what categories were used on the background questionnaires for
the individual examinations, aggregation was often necessary. For example,
the LSAT lists one category for "Fine Arts" and another for "Music." The
GMAT uses only "Fine Arts" (and one assumes that a Music major will check
that box). The GRE reports both for individual majors in the category of
"Fine Arts" and for the entire category. To render the data comparable
across the three tests, I aggregated the individual majors employed on the
LSAT questionnaire, and used the GRE reporting for the entire category of "Arts."

Aggregating such data for purposes of comparability across the tests was
necessary in the following cases:

LSATs: Fine Arts/Music, Foreign Languages, Other Humanities, Other
       Social Sciences, Other Sciences, Engineering, and Other Business.
GMATs: None
GREs:  Foreign Languages, Other Humanities, Other Social Sciences, and
       Other Sciences.

What specific majors are included in the "Other" categories? The references
below are to the categories used on the background questionnaires for the
LSATs and GREs:

"Other Humanities":   LSATs: Religion, Archaeology, Other Humanities
                      GREs:  Religion, Archaeology, Other Humanities,
                             Comparative Literature, Art History, Linguistics

"Other Social Sci.":  LSATs: Geography, American Civilization, Other
                             Social Sciences
                      GREs:  Geography, American Studies, Other Social
                             Sciences, Communication, International
                             Relations, Social Psychology, Urban Development

"Other Sciences":     LSATs: Geology, Astronomy, Other Sciences
                      GREs:  Geology, Astronomy, Other Physical Science,
                             Applied Mathematics, Statistics, Oceanography
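
The aggregation described above can be sketched as a mapping from questionnaire categories to the standard categories, followed by a sum of counts. The mapping below covers only the LSAT "Other" groupings listed here. The function name and the counts in the usage example are hypothetical.

```python
from collections import Counter

# LSAT questionnaire categories folded into the standard "Other" categories.
LSAT_TO_STANDARD = {
    "Religion": "Other Humanities",
    "Archaeology": "Other Humanities",
    "Other Humanities": "Other Humanities",
    "Geography": "Other Social Sci.",
    "American Civilization": "Other Social Sci.",
    "Other Social Sciences": "Other Social Sci.",
    "Geology": "Other Sciences",
    "Astronomy": "Other Sciences",
    "Other Sciences": "Other Sciences",
}


def aggregate_counts(counts, mapping):
    """Sum test-taker counts by standard category; unmapped majors keep their own name."""
    totals = Counter()
    for major, n in counts.items():
        totals[mapping.get(major, major)] += n
    return dict(totals)


# Hypothetical counts, for illustration only.
sample = {"Religion": 10, "Archaeology": 5, "Geology": 7, "English": 100}
print(aggregate_counts(sample, LSAT_TO_STANDARD))
```

Combining mean scores for an aggregated category would additionally require weighting each component mean by its count, which this sketch does not attempt.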

I am sure there will be quarrels with my categories and aggregations here.
And I confess to being uneasy with a few of them. For example, I have
aggregated "American Civilization" and "American Studies" majors with the
"Other Social Science" category, even though the test performance of American
Civ majors is far superior to that of the other majors in the category.
Likewise, Music majors tend to outperform other "Fine Arts" majors on the
examinations, but I nonetheless lumped all Fine Arts majors together to create
a cell of significant magnitude. Too, some might quarrel with what I have left out.
For example, the GMATs have a category for "Statistics" (which I aggregated




TABLE F-3 



with "Other Sciences" on the GREs). But the N in that category on the GMATs
was consistently so small that I did not think it was worth aggregating.

ABBREVIATIONS:  N.Q. = Does not qualify for inclusion because the number of
                       test-takers with that particular major was less than
                       0.5% of the subject universe in that year.

                --   = Test data do not report separately for the major field
                       indicated.

                N.I. = Not indicated for the Subject Universe.






Test Performance by Undergraduate Major                          Table F-4

Year: 1977-1978

                            LSAT        GMAT         GRE      TOT. N

Total Test-Takers        111,555     169,908     286,383     567,846
Subject Universe         111,555     132,385     221,745     465,685
Respondents               85,422     131,778     217,977     435,177
% Responds./Universe       76.6%       99.5%       98.3%       93.4%
Mean (Universe)              538         473     VER. 494; QUAN. 517
Stand. Deviation             110         103     VER. N.I.

                        % by which Mean for Major is Above (Below)
                        the Mean for the Universe                          *% of
Major                       LSAT     GMAT     VER.    QUAN.         N   Respond.

Total                                                          402,286    92.4%

English                      5.4%     6.8%    14.4%   (6.0%)    21,873     5.0%
Philosophy                   8.9     11.2     18.4      5.4      4,255     1.0
Arts/Music                   1.1      0.6      1.4     (7.7)    11,530     2.6
Foreign Langs.               5.6      4.0     10.5     (3.7)     8,925     2.1
Other Humanities             5.8      4.9      9.4     (2.3)     8,839     2.0
History                      2.2      3.4      8.9     (5.4)    21,293     4.9
Economics                    9.7      6.3      1.8     12.8     17,990     4.1
Government                   3.2      6.6
Political Science           (2.0)     1.5      3.8     (4.3)    30,460     7.0
Psychology                   2.4      0.0      3.8     (1.7)    30,039     6.9
Sociology                   (5.8)    (4.9)    (3.8)   (12.4)    12,370     2.8
Anthropology                 7.6       --     15.2      0.0      2,719     0.6
Other Social Sci.            0.2      0.8     (0.2)    (5.0)    19,753     4.5
Biol/Bioscience              4.3      2.3      3.9      9.5     29,341     6.7
Chemistry                    8.0      8.5      3.8     21.5      7,701     1.8
Mathematics                 13.8     12.3      3.0     29.8      9,511     2.2
Physics                      N.Q.     N.Q.     8.9     34.0      3,563     0.8
Other Science                5.9      2.1      2.9     17.4      7,911     1.8
Computer Science               --      6.8      1.2     24.6      2,907     0.7
Engineering                  8.7      9.5     (6.5)    27.3     25,277     5.8
Accounting                   3.5     (2.3)
Finance                      3.0     (2.1)
Marketing                            (7.8)
Business Admin.             (3.7)      --    (11.1)    (2.1)    82,599    19.0
Management/Ind. Man.        (1.7)    (7.8)
Other Business              (0.4)    (6.1)
Education                   (7.1)    (6.1)   (12.3)   (15.3)    34,128     7.8
Journalism                   2.6       --      5.3     (7.5)     2,852     0.7
Social Work                 (8.7)      --     (9.7)   (17.0)     3,404     0.8
Speech                      (1.9)      --     (5.9)   (11.8)     3,046     0.7

Bracketed rows: the GRE, N, and % of Respond. entries shown for Political
Science cover Government and Political Science together; those shown for
Business Admin. cover the six business majors from Accounting through
Other Business.

*Due to rounding, percentages for individual majors may not add up to the total
indicated.



Test Performance by Undergraduate Major                          Table F-5

Year: 1978-1979

                            LSAT        GMAT         GRE      TOT. N

Total Test-Takers         98,307     187,039                 567,828
Subject Universe          98,307     144,089                 461,076
Respondents               74,343     143,354                 431,540
% Responds./Universe       75.6%       99.5%                   93.6%
Mean (Universe)              536         475         489
Stand. Deviation             107         104         123

                        % by which Mean for Major is Above (Below)
                        the Mean for the Universe                          *% of
Major                       LSAT     GMAT     VER.    QUAN.         N   Respond.

Total                                                          378,699    88.1%

English                      4.7%     6.9%    15.5%   (5.6%)    20,266     4.7%
Philosophy                   8.6      9.9     19.4      5.2      3,858     0.9
Arts/Music                   0.1     (0.2)     2.2     (7.2)    11,154     2.6
Foreign Langs.               5.6      4.6      7.4     (3.9)     9,232     2.1
Other Humanities             6.2      4.6      9.2     (3.5)     8,507     2.0
History                      1.9      2.9     10.6     (5.0)    19,240     4.5
Economics                    9.5      5.9      3.7     13.6     18,025     4.2
Government                   3.2      5.3
Political Science           (1.9)     1.1      4.9     (3.5)    28,532     6.6
Psychology                   2.1     (0.2)     4.3     (1.9)    22,779     5.3
Sociology                   (6.7)    (5.3)    (3.3)   (12.6)    11,089     2.6
Anthropology                 6.3       --      6.3     (4.3)     3,304     0.8
Other Social Sci.           (0.2)     0.6     (1.2)    (4.7)    20,212     4.7
Biol/Bioscience              4.1      2.5      0.6      7.8     30,650     7.1
Chemistry                    7.8      8.4      5.1     21.9      7,930     1.8
Mathematics                 12.9     14.1      4.1     29.7      8,353     1.9
Physics                      N.Q.     N.Q.    11.5     34.2      3,294     0.8
Other Science                4.7      1.3      3.7     16.9      8,479     2.0
Computer Science               --      7.4      1.8     26.8      2,969     0.7
Engineering                  8.9      9.7     (4.3)    27.0     26,137     6.1
Accounting                   4.3     (2.5)
Finance                      3.4     (1.7)
Marketing                            (7.?)
Business Admin.             (3.2)      --     (9.4)    (1.7)    71,883    16.7
Management/Ind. Man.        (2.4)    (7.4)
Other Business              (0.5)    (5.5)
Education                   (6.7)    (6.3)   (11.2)   (14.6)    33,179     7.7
Journalism                   2.1       --      7.0     (6.4)     2,572     0.6
Social Work                 (7.2)      --     (9.0)   (17.1)     3,346     0.8
Speech                      (0.9)      --     (9.4)   (10.5)     3,709     0.9

Bracketed rows: the GRE, N, and % of Respond. entries shown for Political
Science cover Government and Political Science together; those shown for
Business Admin. cover the six business majors from Accounting through
Other Business.

*Due to rounding, percentages for individual majors may not add up to the total
indicated.



Test Performance by Undergraduate Major

Year: 1979-1980

                            LSAT        GMAT         GRE      TOT. N

Total Test-Takers         94,583     209,739     272,281     576,603
Subject Universe          94,583     156,548     210,749     461,880
Respondents               74,365     155,380     207,173     436,918
% Responds./Universe       78.6%       99.3%       98.3%       94.6%
Mean (Universe)              539         473     VER. 488; QUAN. 516
Stand. Deviation             108         102     VER. 123; QUAN. 131

                        % by which Mean for Major is Above (Below)
                        the Mean for the Universe                          *% of
Major                       LSAT     GMAT     VER.    QUAN.         N   Respond.

Total                                                         [illeg.] [illeg.]

English                      4.5%     6.8%    15.0%   (5.6%)  [illeg.]     4.7%
Philosophy                   7.2     10.8     19.5      5.0   [illeg.]     0.9
Arts/Music                   0.2      0.0  [illeg.]    (7.4)  [illeg.] [illeg.]
Foreign Langs.               4.3      4.2     10.7     (3.3)  [illeg.]     1.9
Other Humanities             5.8      4.2      9.2     (5.0)  [illeg.]     2.0
History                      1.7      4.0     10.5     (5.0)  [illeg.]     4.1
Economics                    8.0      6.3      0.8     13.0   [illeg.]     4.3
Government                   2.4      4.4
Political Science           (2.8)     1.1      3.9     (4.5)  [illeg.]     6.3
Psychology                   0.7      1.1      3.3     (3.1)  [illeg.]     6.4
Sociology                   (7.2)    (4.4)    (4.3)   (13.0)  [illeg.]     2.5
Anthropology                 3.5       --     16.0     (0.4)  [illeg.] [illeg.]
Other Social Sci.           (1.9)     0.8      0.0     (5.4)  [illeg.]     4.0
Biol/Bioscience              3.5      3.0      4.7      9.7   [illeg.]     6.2
Chemistry                    6.7      8.0      2.9     20.5   [illeg.]     1.7
Mathematics                 12.6     14.0      3.3     22.6   [illeg.]     1.9
Physics                      N.Q.     N.Q.     9.0     32.6   [illeg.] [illeg.]
Other Science                2.8      0.8      2.9     17.1   [illeg.]     2.1
Computer Science               --      5.5     (1.0)    25.0   [illeg.] [illeg.]
Engineering                  7.8     10.6     (6.6)    27.3   [illeg.]     6.6
Accounting                   3.2     (0.9)
Finance                      2.3     (6.8)
Marketing
Business Admin.             (4.6)      --    (10.5)    (1.2)    79,088    18.1
Management/Ind. Man.        (3.9)    (7.8)
Other Business              (0.9)    (5.1)
Education                   (8.0)    (6.3)   (11.7)   (15.1)    33,947     7.8
Journalism                   0.6       --      5.1     (8.3)     2,761     0.6
Social Work                 (9.5)      --    (10.0)   (18.6)     3,574     0.8
Speech                      (3.9)      --     (5.5)   (13.0)     2,670     0.6

Bracketed rows: the GRE, N, and % of Respond. entries shown for Political
Science cover Government and Political Science together; those shown for
Business Admin. cover the six business majors from Accounting through
Other Business.

*Due to rounding, percentages for individual majors may not add up to the total
indicated.



Test Performance by Undergraduate Major                          Table F-7

Year: 1980-1981

                            LSAT        GMAT         GRE      TOT. N

Total Test-Takers        100,793     214,555     262,855     578,203
Subject Universe         100,793     156,591     203,131     460,515
Respondents               85,057     155,030     198,768     438,855
% Responds./Universe       84.4%       99.0%       97.9%       95.3%
Mean (Universe)              544         478     VER. 486; QUAN. 520
Stand. Deviation             110          99     VER. 122; QUAN. 132

                        % by which Mean for Major is Above (Below)
                        the Mean for the Universe                          *% of
Major                       LSAT     GMAT     VER.    QUAN.         N   Respond.

Total                                                          386,592    88.3%

English                      5.1%     6.7%    14.6%   (6.2%)    20,071     4.6%
Philosophy                   8.8     11.4     17.9      3.3      3,915     0.9
Arts/Music                  (0.4)    (0.8)     1.4     (8.3)    11,008     2.5
Foreign Langs.               4.7      4.2      9.5     (3.3)     7,836     1.8
Other Humanities             5.5      3.8      8.0     (4.8)     8,940     2.0
History                      2.9      4.0     10.3     (5.6)    17,504     4.0
Economics                    9.7      6.9      0.6     12.7     19,150     4.4
Government                   3.5      4.0
Political Science           (2.0)     1.0      4.1     (4.0)    29,164     6.6
Psychology                   0.2      0.8      3.3     (3.7)    27,692     6.3
Sociology                   (7.2)    (5.2)    (3.3)   (13.1)     8,297     1.9
Anthropology                 3.5       --     14.8     (2.3)     2,334     0.5
Other Social Sci.           (0.9)     0.2     (0.2)    (6.2)    20,576     4.7
Biol/Bioscience              3.9      3.3      4.7      8.8     24,654     5.6
Chemistry                    8.1      7.9      3.5     20.2      7,433     1.7
Mathematics                 12.7     13.0      3.1     27.9      7,525     1.7
Physics                      N.Q.     N.Q.     7.8     31.3      3,441     0.8
Other Science                3.2      0.4      3.5     16.0     10,222     2.4
Computer Science               --      5.0     (0.6)    24.0      4,244     1.0
Engineering                  8.3      9.8     (7.6)    26.2     30,405     6.9
Accounting                   2.9     (1.7)
Finance                      3.1     (0.4)
Marketing                            (7.7)
Business Admin.             (4.2)      --     (9.7)    (1.0)    82,610    18.8
Management/Ind. Man.        (3.9)    (7.5)
Other Business              (1.3)    (4.8)
Education                   (8.3)    (6.1)   (10.7)   (15.4)    31,351     7.1
Journalism                   0.9       --      4.3     (9.0)     3,088     0.7
Social Work                 (8.5)      --     (9.5)   (19.2)     3,629     0.8
Speech                      (3.1)      --     (6.0)   (14.4)     2,533     0.6

Bracketed rows: the GRE, N, and % of Respond. entries shown for Political
Science cover Government and Political Science together; those shown for
Business Admin. cover the six business majors from Accounting through
Other Business.

*Due to rounding, percentages for individual majors may not add up to the total
indicated.



Test Performance by Undergraduate Major                          Table F-8

Year: 1981-1982

                            LSAT        GMAT         GRE      TOT. N

Total Test-Takers         99,928     202,304     256,381     558,613
Subject Universe          99,928     139,964     180,798     420,690
Respondents               85,198     138,946     174,624     398,768
% Responds./Universe       85.3%       99.3%       96.6%       94.8%
Mean (Universe)              553         482     VER. 482; QUAN. 525
Stand. Deviation             110          97     VER. 123; QUAN. 134

                        % by which Mean for Major is Above (Below)
                        the Mean for the Universe                          *% of
Major                       LSAT     GMAT     VER.    QUAN.         N   Respond.

Total                                                          353,680    88.8%

English                      5.6%     4.1%    14.5%   (5.7%)  [illeg.]     4.4%
Philosophy                   8.7     11.0     17.6      4.6   [illeg.]     0.9
Arts/Music                  (0.5)    (1.2)     1.7  [illeg.]  [illeg.] [illeg.]
Foreign Langs.               5.7      3.3      7.9     (4.2)     7,068     1.8
Other Humanities             4.7      1.8      7.3     (5.0)     8,341     2.1
History                      2.9      4.6     10.8     (5.5)    15,123     3.8
Economics                    9.6      7.3      0.8     12.4     17,562     4.4
Government                   3.3      4.6
Political Science           (1.6)     0.6      3.5     (5.0)    27,337     6.9
Psychology                   0.9      0.8      3.1     (4.0)    24,885     6.2
Sociology                   (7.0)    (5.0)    (5.0)   (15.0)     8,693     2.2
Anthropology                 4.0       --     16.4     (1.7)     1,863     0.5
Other Social Sci.           (0.9)     0.3     (0.4)    (7.2)    20,048     5.0
Biol/Bioscience              4.0      3.3      5.4      8.0     22,820     5.7
Chemistry                    7.6      7.5      2.1     18.3      6,867     1.7
Mathematics                 12.8     13.3      2.7     26.3      6,564     1.6
Physics                      N.Q.     N.Q.     6.6     29.5      3,183     0.8
Other Science                2.8      0.8      3.5     14.5      9,154     2.3
Computer Science               -- [illeg.]    (1.5)    22.9      5,035     1.3
Engineering                  8.0     10.0     (7.3)    25.1     29,718     7.5
Accounting                   3.4     (1.5)
Finance                      3.4     (0.8)
Marketing                            (8.1)
Business Admin.             (4.5)      --     (9.1)    (2.3)    77,679    19.5
Management/Ind. Man.        (5.4)    (7.7)
Other Business              (0.9)    (5.0)
Education                   (8.7)    (4.2)   (10.4)   (15.8)    22,978     5.8
Journalism                   0.7       --      3.7     (8.4)     2,767     0.7
Social Work                (10.1)      --     (9.1)   (20.8)     2,999     0.8
Speech                      (2.?)      --     (6.0)   (14.3)     2,159     0.5

Bracketed rows: the GRE, N, and % of Respond. entries shown for Political
Science cover Government and Political Science together; those shown for
Business Admin. cover the six business majors from Accounting through
Other Business.

*Due to rounding, percentages for individual majors may not add up to the total
indicated.



General Area: Humanities

Table indicates the percentage by which the Mean for the major is above
(below) the Mean for the universe.

Major & Test               1977-78   1978-79   1979-80   1980-81   1981-82

English:
  LSAT                        5.4%      4.7%      4.5%      5.1%      5.6%
  GMAT                        6.8       6.9       6.8       6.7       4.1
  GRE/Verbal                 14.4      15.5      15.0      14.6      14.5
  GRE/Quant.                 (6.0)     (5.6)     (5.6)     (6.2)     (5.7)
  % of Respondents:           5.0%      4.7%      4.7%      4.6%      4.4%

Philosophy:
  LSAT                        8.9%      8.6%      7.2%      8.8%      8.7%
  GMAT                       11.2       9.9      10.8      11.4      11.0
  GRE/Verbal                 18.4      19.4      19.5      17.9      17.6
  GRE/Quant.                  5.4       5.2       5.0       3.3       4.6
  % of Respondents:           1.0%      0.9%      0.9%      0.9%      0.9%

Foreign Languages:
  LSAT                        5.0%      5.6%      4.3%      4.7%      5.7%
  GMAT                        4.0       4.6       4.2       4.1       3.3
  GRE/Verbal                 10.5       7.4      10.7       9.5       7.9
  GRE/Quant.                 (3.7)     (3.9)     (3.3)     (3.3)     (4.2)
  % of Respondents:           2.1%      2.1%      1.9%      1.8%      1.8%

History:
  LSAT                        2.2%      1.9%      1.7%      2.9%      2.9%
  GMAT                        3.4       2.9       4.0       4.0       4.6
  GRE/Verbal                  8.9      10.6      10.5      10.3      10.8
  GRE/Quant.                 (5.4)     (5.0)     (5.0)     (5.6)     (5.5)
  % of Respondents:           4.9%      4.5%      4.1%      4.0%      3.8%

Other Humanities:
  LSAT                        5.8%      6.2%      5.8%      5.5%      4.7%
  GMAT                        4.9       4.6       4.2       3.7       1.8
  GRE/Verbal                  9.4       9.2       9.2       8.0       7.3
  GRE/Quant.                 (2.3)     (3.5)     (5.0)     (4.8)     (5.0)
  % of Respondents:           2.0%      2.0%      2.0%      2.0%      2.1%



Test Performance by Undergraduate Major                         Table F-10

General Area: Social Sciences

Table indicates the percentage by which the Mean for the major is above
(below) the Mean for the universe.

Major & Test               1977-78   1978-79   1979-80   1980-81   1981-82

Economics:
  LSAT                        9.7%      9.5%      8.0%      9.7%      9.6%
  GMAT                        6.3       5.9       6.3       6.9       7.3
  GRE/Verbal                  1.8       3.7       0.8       0.6       0.8
  GRE/Quant.                 12.8      13.6      13.0      12.7      12.4
  % of Respondents:           4.1%      4.2%      4.3%      4.4%      4.4%

Psychology:
  LSAT                        2.4%      2.1%      0.7%      0.2%      0.9%
  GMAT                        0.0      (0.2)      1.1       0.8       0.8
  GRE/Verbal                  3.8       4.3       3.3       3.3       3.1
  GRE/Quant.                 (1.7)     (1.9)     (3.1)     (3.7)     (4.0)
  % of Respondents:           6.9%      5.3%      6.4%      6.3%      6.2%

Sociology:
  LSAT                      (5.8%)    (6.7%)    (7.2%)    (7.2%)    (7.0%)
  GMAT                       (4.9)     (5.3)     (4.4)     (5.2)     (5.0)
  GRE/Verbal                 (3.8)     (3.3)     (4.3)     (3.3)     (5.0)
  GRE/Quant.                (12.4)    (12.6)    (13.0)    (13.0)    (15.0)
  % of Respondents:           2.8%      2.6%      2.5%      1.9%      2.2%

Political Science [1]:
  LSAT                      (2.0%)    (1.9%)    (2.8%)    (2.0%)    (1.6%)
  GMAT                        1.5       1.1       1.1       1.0       0.6
  GRE/Verbal                  3.8       4.9       3.9       4.1       3.5
  GRE/Quant.                 (4.3)     (3.5)     (4.5)     (4.0)     (5.0)
  % of Respondents [2]:       7.0%      6.6%      6.3%      6.6%      6.9%

Other Social Science:
  LSAT                        0.2%    (0.2%)    (1.9%)    (0.9%)    (0.9%)
  GMAT                        0.8       0.6       0.8       0.2       0.3
  GRE/Verbal                 (0.2)     (1.2)      0.0      (0.2)     (0.4)
  GRE/Quant.                 (5.0)     (4.7)     (5.4)     (6.2)     (7.2)
  % of Respondents:           4.5%      4.7%      4.0%      4.7%      5.0%

NOTES: 1. Both the LSAT and GMAT distinguish between Political Science
          and Government. The GRE does not. The data here apply only to
          Political Science.
       2. The percentage represents the total number of respondents
          indicating an undergraduate major in either Political Science or
          Government (even though the test data does not include Government
          majors on the LSAT and GMAT).



Test Performance by Undergraduate Major                         Table F-11

General Area: Science and Mathematics

Table indicates the percentage by which the Mean for the major is above
(below) the Mean for the universe.

Major & Test               1977-78   1978-79   1979-80   1980-81   1981-82

Biology/Biosci.:
  LSAT                        4.3%      4.1%      3.5%      3.9%      4.0%
  GMAT                        2.3       2.5       3.0       3.3       3.3
  GRE/Verbal                  3.9       0.6       4.7       4.7       5.4
  GRE/Quant.                  9.5       7.8       9.7       8.8       8.0
  % of Respondents:           6.7%      7.1%      6.2%      5.6%      5.7%

Chemistry:
  LSAT                        8.0%      7.8%      6.7%      8.1%      7.6%
  GMAT                        8.5       8.4       8.0       7.9       7.5
  GRE/Verbal                  3.8       5.1       2.9       3.5       2.1
  GRE/Quant.                 21.5      21.9      20.5      20.2      18.3
  % of Respondents:           1.8%      1.8%      1.7%      1.7%      1.7%

Mathematics:
  LSAT                       13.8%     12.9%     12.6%     12.7%     12.8%
  GMAT                       12.3      14.1      14.0      13.0      13.3
  GRE/Verbal                  3.0       4.1       3.3       3.1       2.7
  GRE/Quant.                 29.8      29.7      22.6      27.9      26.3
  % of Respondents:           2.2%      1.9%      1.9%      1.7%      1.6%

Engineering [1]:
  LSAT                        8.7%      8.9%      7.8%      8.3%      8.0%
  GMAT                        9.5       9.7      10.6       9.8      10.0
  GRE/Verbal                 (6.5)     (4.3)     (6.6)     (7.6)     (7.3)
  GRE/Quant.                 27.3      27.0      27.3      26.2      25.1
  % of Respondents:           5.8%      6.1%      6.6%      6.9%      7.5%

Other Sciences:
  LSAT                        5.9%      4.7%      2.8%      3.2%      2.8%
  GMAT                        2.1       1.3       0.8       0.4       0.8
  GRE/Verbal                  2.9       3.7       2.9       3.5       3.5
  GRE/Quant.                 17.4      16.9      17.1      16.0      14.5
  % of Respondents:           1.8%      2.0%      2.1%      2.4%      2.3%

NOTES: 1. One might have placed Engineering in the "Professional"
          category, but the patterns of test performance for Engineering
          majors were closer to those of Science and Math majors than to
          those of other undergraduate professional program majors.



Test Performance by Undergraduate Major                         Table F-12

General Area: Professional

Table indicates the percentage by which the Mean for the major is above
(below) the Mean for the universe.

Major & Test               1977-78   1978-79   1979-80   1980-81   1981-82

Journalism:
  LSAT                        2.6%      2.1%      0.6%      0.9%      0.7%
  GMAT                         --        --        --        --        --
  GRE/Verbal                  5.3       7.0       5.1       4.3       5.7
  GRE/Quant.                 (7.5)     (6.4)     (8.3)     (9.0)     (8.6)
  % of Respondents:           0.7%      0.6%      0.6%      0.7%      0.7%

Social Work:
  LSAT                      (8.7%)    (7.2%)    (9.5%)    (8.5%)   (10.1%)
  GMAT                         --        --        --        --        --
  GRE/Verbal                 (9.7)     (9.0)    (10.0)     (9.5)     (9.1)
  GRE/Quant.                (15.3)    (14.6)    (15.1)    (15.4)    (15.8)
  % of Respondents:           0.8%      0.8%      0.8%      0.8%      0.8%

Education:
  LSAT                      (7.1%)    (6.7%)    (8.0%)    (8.3%)    (8.7%)
  GMAT                       (6.1)     (6.1)     (6.3)     (6.3)     (4.2)
  GRE/Verbal                (12.3)    (11.2)    (11.7)    (10.7)    (10.4)
  GRE/Quant.                (15.3)    (14.6)    (15.1)    (15.4)    (15.8)
  % of Respondents:           7.8%      7.7%      7.8%      7.1%      5.8%

Business:
  LSAT                      (3.7%)    (3.2%)    (4.6%)    (4.2%)    (4.5%)
  GMAT [1]                   (7.8)     (7.4)     (7.8)     (7.5)     (7.7)
  GRE/Verbal [2]            (11.1)    (11.2)    (11.7)     (9.7)     (9.1)
  GRE/Quant. [2]             (2.1)     (1.7)     (1.2)     (1.0)     (2.3)
  % of Respondents [3]:      19.0%     16.7%     18.1%     18.8%     19.5%

NOTES: 1. The GMAT does not report a separate category for Business
          Administration, rather disaggregates the field into eight (8)
          sub-categories. Of these, I have substituted "Management" for
          "Business Administration" in this table, on the grounds that
          (a) the term is often substituted for "Business Administration"
          as the title of the major in American colleges, and (b) after
          Accounting, the number of U.S. citizens indicating their major on
          the GMAT is higher for Management than for any of the other
          sub-categories of Business Administration.

       2. The GRE aggregates all sub-categories of Business into a single
          field (the LSAT does not).

       3. The percentage represents the total number of respondents
          indicating an undergraduate major in any field of Business
          covered in our tables.



TABLE F-13

Test Performance by Undergraduate Major, 1977-1982:
24 Majors: Rank by Average Mean Differential

Rank   LSAT [1]            GMAT [2]            GRE/V               GRE/Q

 1.    Mathematics         Mathematics         Philosophy          Physics
 2.    Economics           Philosophy          English             Mathematics
 3.    Philosophy          Engineering         Anthropology        Engineering
 4.    Engineering         Chemistry           History             Computer Science
 5.    Chemistry           Economics           Foreign Langs.      Chemistry
 6.    Oth. Humanities     English             Physics             Other Science
 7.    Foreign Langs.      Computer Sci.       Oth. Humanities     Economics
 8.    English             Foreign Langs.      Journalism          Biology
 9.    Anthropology        History             Polit. Sci.         Philosophy
10.    Biology             Oth. Humanities     Biology             Anthropology
11.    Oth. Sciences       Biology             Psychology          Business
12.    History             Oth. Sciences       Chemistry           Psychology
13.    Psychology          Polit. Sci.         Other Science       Foreign Langs.
14.    Journalism          Psychology          Mathematics         Other Humanities
15.    Art & Music         Oth. Soc. Sci.      Art & Music         Polit. Sci.
16.    Oth. Soc. Sci.      Art & Music         Economics           Oth. Soc. Sci.
17.    Polit. Sci.         Sociology           Computer Sci.       History
18.    Speech              Education           Oth. Soc. Sci.      English
19.    Business            Business            Sociology           Art & Music
20.    Sociology                               Engineering         Journalism
21.    Education                               Speech              Speech
22.    Social Work                             Social Work         Sociology
23.                                            Business            Education
24.                                            Education           Social Work

NOTES: 1. The LSAT does not report separately for Computer Science majors, and the
          number of test-takers with a Physics major was too small to include in the
          tables.

       2. The GMAT does not report separately for Journalism, Social Work, Speech,
          or Anthropology majors, and the number of test-takers with a Physics major
          was too small to include in the tables.







TABLE G

Faculty v. Committee [word illegible] Perceptions of Test Content v. Specifications

Percent Allocated to Content Categories

                              Respondents'   Respondents'   Respondents'
                              Current        Ideal          Item             Committee
CONTENT CATEGORIES            Curriculum     Curriculum     Classification   Specifications

COMPUTER SCIENCE

Software Systems
  and Methodology             40.7*          33.1           33.0             35
Computer Organization
  and Logic                   20.1           20.7           20.1             20
Theory                        12.2*          17.7*          20.5             20
Computational
  Mathematics                 15.5*          16.6*          23.4*            20
Special Topics                11.5*          11.9*           2.7*             5

EDUCATION

Educational Goals             15.9           16.4           17.8*            15
Administration and
  Supervision of Schools      11.1*          12.3*          15.8             15
Curriculum Development
  and Organization            24.4*          21.1*          13.6             15
Teaching-Learning             35.4*          32.6*          33.0*            40
Measurement, Evaluation,
  and Research                13.4*          17.6*          19.9*            15

CHEMISTRY

Analytical Chemistry          22.3*          22.5*          Indicated in     15
Inorganic Chemistry           20.3*          22.1*          Separate         25
Organic Chemistry             30.7           28.2           Analysis of Each 30
Physical Chemistry            26.6*          27.0*          Sub-Discipline   30

*Differs from Committee Specifications, [significance level illegible]

Source: Philip K. Oltman, Content Representativeness of the Graduate Record
Examinations Advanced Tests in Chemistry, Computer Science and Education.
ETS: 1982. Tables .3, .4 and [remainder illegible]






TABLE H, Page 1

TABLE H-1

Domestic versus Foreign Student Performance: GREs and GMATs

          GRADUATE RECORD EXAMINATIONS                   GRADUATE MANAGEMENT ADMISSIONS TEST

          Per Cent   VERBAL             QUANTITATIVE     Per Cent   VERBAL             QUANTITATIVE
          Domestic                                       Domestic
          Test-      [illeg.]  Domes.   [illeg.]  Domes. Test-      For.      Domes.   For.      Domes.
Year      Takers     Mean      Mean     Mean      Mean   Takers     Mean      Mean     Mean      Mean

1972-3               497       500      512       510    N.A.       N.A.      N.A.     N.A.      N.A.
1973-4               492       498      509       505    N.A.       N.A.      N.A.     N.A.      N.A.
1974-5               493       497      508       507    N.A.       N.A.      N.A.     N.A.      N.A.
1975-6    92.[?]%    492       498      510       507    N.A.       N.A.      N.A.     N.A.      N.A.
1976-7    91.3       490       495      515       50[?]  N.A.       N.A.      N.A.     N.A.      N.A.
1977-8    91.1       484       491      518       512    80         18.9      27.8     26.5      27.2
1978-9    90.0       476       486      517       508    79         19.6      27.8     26.3      27.0
1979-80   89.3       474       484      522       512    79         20.2      27.8     27.1      26.8
1980-1    88.1       473       483      523       513    80         20.8      28.3     28.0      26.8
1981-2    86.7       469       483      533       519    79         21.0      28.5     28.5      27.2



Notes:

1. The percentage of U.S. citizens taking the GREs is based on responses from those
test-takers who (a) were either taking the GRE for the first time or who had taken
it previously but prior to the beginning of the testing year in question and
(b) who filled out the background information questionnaire. In 1981-1982, for
example, that universe was 180,798, or 70.5% of the total number of test-takers.

2. William Turnbull, former President and currently Distinguished Scholar-in-Residence
at ETS, computed these mean scores, which are approximations based on results of
regularly scheduled domestic administrations of the GRE.

3. The percentage of U.S. citizens taking the GMATs is based solely on respondents to
the background information question about country of citizenship. In 1981-82, for
example, some 12% of the GMAT test-takers did not respond to this question.

4. Mean scores for non-U.S. citizens taking the GMATs were determined by the author
by disaggregation, i.e. by removing U.S. citizens and non-respondents from the total.
The results are thus approximations.

5. Mean scores for respondents indicating U.S. citizenship are clearly identified
in the tables provided to the author courtesy of the Graduate Management Admissions
Council.
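
The disaggregation described in Note 4 can be illustrated with a short sketch. It is not the author's actual computation: the function name and all figures below are hypothetical, and it assumes that the size and mean of every removed group (U.S. citizens and non-respondents) are known, which the notes do not state. Where the non-respondent mean has to be estimated, the result is only an approximation, as Note 4 cautions.

```python
def disaggregate_mean(total_n, total_mean, removed_groups):
    """Recover the size and mean score of the residual group.

    total_n, total_mean: count and mean score for all test-takers.
    removed_groups: list of (n, mean) pairs for the groups taken out
    of the total (here: U.S. citizens and non-respondents).
    All inputs are hypothetical illustration values, not report data.
    """
    residual_n = total_n - sum(n for n, _ in removed_groups)
    if residual_n <= 0:
        raise ValueError("removed groups leave no residual test-takers")
    # Sum of scores = count * mean; subtract each removed group's score sum.
    residual_sum = total_n * total_mean - sum(n * m for n, m in removed_groups)
    return residual_n, residual_sum / residual_n


# Hypothetical example: 100 test-takers with an overall mean of 30.0,
# of whom 80 are U.S. citizens (mean 31.0) and 10 are non-respondents
# (mean 29.0). The remaining 10 are treated as non-U.S. citizens.
n_foreign, mean_foreign = disaggregate_mean(100, 30.0, [(80, 31.0), (10, 29.0)])
print(n_foreign, mean_foreign)
```

Because the residual group is small relative to the total, modest errors in the removed groups' means shift the residual mean considerably, which is one reason to treat disaggregated figures as approximate.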



TABLE H, Page 2

TABLE H-2

GRE General Test Score Trends Including and Excluding the Scores
of Foreign Students, 1972-73 through 1981-82*

[Chart not reproduced.]

*Chart provided courtesy of the Educational Testing Service

TABLE I

Changes in Number of Test Takers Following Inclusion of Local Program Assessment
Uses of the GRE Subject Area Tests in the National Administrations in 1969

                 1967-8 to 1968-9             1968-9 to 1969-70            1969-70 to 1970-71

                 % Change      Direction      % Change      Direction      % Change      Direction
                 in N of       of Change      in N of       of Change      in N of       of Change
Test             Test-Takers   in Mean Score  Test-Takers   in Mean Score  Test-Takers   in Mean Score

Biology          6.0%          Stable         28.[?]%       Down           [missing]     Stable
Chemistry        0.2           Down           12.9          Stable         (1.1)         Up
Physics          (8.9)         Stable         7.6           Stable         (12.3)        Up
Geology          0.8           Down           30.9          Down           9.2           Stable
Mathematics      0.1           Down           12.8          Down           (0.3)         Up
Engineering      (9.3)         Down           23.3          Down           (4.6)         Stable
Economics        0.2           Down           24.7          Down           (0.2)         Down
Pol. Science     11.9          Down           13.2          Down           (5.5)         Down
Sociology        11.0          Down           33.1          Down           12.7          Down
Psychology       14.1          Down           27.3          Down           9.5           Stable
Education        14.6          Stable         36.2          Down           16.6          Down
History          4.2           Down           20.0          Down           (4.3)         Down
Eng. Lit.        7.7           Down           [missing]     Down           (3.1)         Down
French           12.1          Down           (2.2)         Down           2.7           Down
Music            19.0          Down           33.6          Down           6.4           Down




Table J

Percentage of Graduate Departments with Viable Doctoral Programs* in
Selected Fields that Either Required or Recommended the GRE Area Tests

                                                  Criteria for Inclusion:
                                                  # of Ph.D.s        # of        # of Departments
Field                1971    [1975]       1979    Prior 3 Yrs.  or   Students    Included, 1979

English              74%     [illegible]  73%     10                 35          120
French               73      76           76      5                  15          49
History              76      75           71      10                 [illegible] 113
Economics            78      75           69      10                 30          99
Political Science    73      71           60      10                 24          93
Sociology            68      71           62      15                 25          102
Psychology           74      80           73      15                 40          164
Chemistry            74      78           83      10                 20          155
Physics              84      90           92      5                  20          125
Mathematics          67      76           69      5                  12          132

Source: Graduate Programs and Admissions Manuals; Princeton, N.J.: Educational Testing Service. Quadrennial
publication.

*Percentages were determined by counting the number of "viable" doctoral programs and determining the number of
those programs that either required or recommended the GRE Subject Area tests. The criteria for "viable"
are key. There are no objective measures, and each field has to be looked at differently on the basis of the
data in the Manuals.






U.S. Citizen/Foreign Student Mean Scores x Selected Undergraduate Major:
GMAT Quantitative Examination, 1981-82

                                  U.S. Citizens    Non-U.S. Citizens
                                  Mean             Mean*

Quantitative-Based Disciplines

Chemistry                         31.32            31.09
Computer Science                  31.34            31.47
Engineering                       34.16            33.[?]7
Mathematics                       34.78            34.55
[illegible]                       36.72            34.88
Economics                         29.64            28.32
Accounting                        27.83            27.78

Non-Quantitative Disciplines

English                           26.02            27.37
Foreign Languages                 25.74            27.37
History                           26.37            26.03
Psychology                        25.40            26.03
Political Science                 25.81            25.61
Sociology                         23.49            24.83

Management                        24.58            26.40

*Disaggregated, therefore approximate.

TABLE K

U.S. versus Foreign Student Means on the GMAT Quantitative ...

[Chart not reproduced.]

1982 ACT Composite Mean Scores for Entering Freshmen Planning
to Major in Selected Disciplines*

Professional/Occupational Fields              Mean

Home Economics                                [missing]
Community Services (e.g. Social Work)         16.9
Industrial & Technical Fields                 16.9
Education                                     17.8
Agriculture                                   [missing]
Business Fields                               [missing]
Health Professions (e.g. Nursing)             19.[?]
Architecture                                  [illegible]
Engineering                                   [illegible]

Arts & Sciences Fields

Social Sciences                               20.6
Foreign Languages                             20.7
Letters (e.g. English, History)               21.0
Biological Sciences                           22.1
Mathematics                                   [illegible]
Physical Sciences                             24.0

* American College Testing Program. College Student Profiles: Norms for the
ACT Assessment. Iowa City, Iowa: ACT, 1983, pp. [illegible].



The "Conventional Wisdom" Hypothesis: Mean Scores and Candidate Volume on Four GRE Subject Area Tests, 1964-1982

[Chart not reproduced.]

■70, see pp. 19-20 and Ta 



